Database Security


GÜNTHER PERNUL
Institute of Applied Computer Science
Department of Information Engineering
University of Vienna

1. Introduction
   1.1 The Relational Data Model Revisited
   1.2 The Vocabulary of Security and Major Database Security Threats
2. Database Security Models
   2.1 Discretionary Security Models
   2.2 Mandatory Security Models
   2.3 The Adapted Mandatory Access Control Model
   2.4 The Personal Knowledge Approach
   2.5 The Clark and Wilson Model
   2.6 A Final Note on Database Security Models
3. Multilevel Secure Prototypes and Systems
   3.1 SeaView
   3.2 Lock Data Views
   3.3 ASD-Views
4. Conceptual Data Model for Multilevel Security
   4.1 Concepts of Security Semantics
   4.2 Classification Constraints
   4.3 Consistency and Conflict Management
   4.4 Modeling the Example Application
5. Standardization and Evaluation Efforts
6. Future Directions in Database Security Research
7. Conclusions
References

ADVANCES IN COMPUTERS, VOL. 38
Copyright © 1994 by Academic Press Inc. All rights of reproduction in any form reserved. ISBN 0-12-012138-7

1. Introduction

Information stored in databases is often considered a valuable and important corporate resource. Many organizations have become so dependent on the proper functioning of their systems that a disruption of service or a leakage of stored information may cause outcomes ranging from inconvenience to catastrophe. Corporate data may relate to financial records, may be essential to the successful operation of an organization, may represent trade secrets, or may describe information about persons whose privacy must be protected. Thus, the general concept of database security is very broad and embraces such areas as the moral and ethical issues imposed by public and society, legal issues in which laws are passed regulating the collection and disclosure of stored information, and more technical issues such as ways of protecting stored information from loss or unauthorized access, destruction, use, modification, or disclosure.

More generally, database security is concerned with ensuring the secrecy, integrity, and availability of data stored in a database. To define our terms, secrecy denotes the protection of information from unauthorized disclosure either by direct retrieval or by indirect logical inference. In addition, secrecy must deal with the possibility that information may also be disclosed by legitimate users acting as an “information channel” by passing secret information to unauthorized users. This may be done intentionally or without the knowledge of the authorized user. By integrity we understand the need to protect data from malicious or accidental modification, including insertion of false data, contamination of data, and destruction of data. Integrity constraints are rules that define the correct states of a database and thus can protect the correctness of the database during operation. By availability we understand the characteristic according to which we may be certain that data are available to authorized users when they need them. Threats to availability include the “denial of service” of a system, which occurs when a system is not functioning in accordance with its intended purpose. Availability is closely related to integrity because “denial of service” may be caused by unauthorized destruction, modification, or delay of service as well.

Database security cannot be seen as an isolated problem, as it is influenced by the other components of a computerized system. The security requirements of a system are specified by means of a security policy that is then enforced by various security mechanisms.
For databases, the security requirements can be classified in the following categories:

• Identification, Authentication. Usually, before gaining access to a database, each user has to identify himself to the computer system. Authentication is a way of verifying the identity of a user at log-on time. The most common authentication method is the password, but more advanced techniques like badge readers, biometric recognition techniques, or signature analysis devices are also available.

• Authorization, Access Controls. Authorization consists in the specification of a set of rules that declare who has a particular type of access to a particular type of information. Authorization policies, therefore, govern the disclosure and modification of information. Access controls are procedures that are designed to control authorization by limiting access to stored data to authorized users only.

• Integrity, Consistency. An integrity policy gives a set of rules (i.e., semantic integrity constraints) that define the correct states of the database during database operation and, therefore, can protect against malicious or accidental modification of information. Closely related issues are concurrency control and recovery. Concurrency control policies protect the integrity of the database in the presence of concurrent transactions. If these transactions do not terminate normally due to system crashes or security violations, recovery techniques may be used to reconstruct correct or valid database states.

• Auditing. The requirement to keep records of all security-relevant actions issued by a user is called auditing. The resulting audit records are the basis for further reviews and examinations in order to test the adequacy of system controls and to recommend changes in a security policy.

In this chapter we will not take this broad perspective of database security. Instead, the main focus will be on aspects of authorization and access controls. This restriction is justified, since identification, authentication, and auditing¹ normally fall within the scope of the underlying operating system, and integrity and consistency policies are subject to the closely related topic of “semantic data modeling” or are dependent on the physical design of the database management system (DBMS) software, namely, the transaction and recovery manager. Because most research in database security has concentrated on the relational data model, the discussion in this chapter will focus on the framework of relational databases. However, the results described may generally be applicable to other database models as well. For an overall discussion on basic database security concepts consult the surveys by Jajodia and Sandhu (1990a), Lunt and Fernandez (1990), and Denning (1988). For references to further readings consult the annotated bibliography compiled by Pernul and Luef (1992).

¹ However, audit records are often stored and examined by using DBMS software.

In the remainder of the opening section we briefly review the relational data model, introduce a simple example that will be used throughout the chapter, present the basic terminology used in computer security, and describe the most successful methods of penetrating a database. Because of the diversity of application domains for databases, different security models and techniques have been proposed so far. In Section 2 we review, evaluate, and compare the most prominent examples of these security models and techniques. Section 3 contains an investigation of secure (trusted) database management systems. By a secure DBMS we understand special-purpose systems that support a level-based security policy and are designed and implemented with the main focus on the enforcement of high security requirements. Section 4 focuses on one of the major problems of level-based security-related database research. In this section we address the problem of classifying the data stored in a database so that the security classifications reflect the security requirements of the application domain proper. What is necessary here is a clear understanding of all the security semantics of the database application and an appropriate, clever database design. A semantic data/security model is proposed in order to arrive at a conceptualization and clear understanding of the security semantics of the database application.

Database security (and computer security in general) is subject to many national and international standardization efforts. These efforts are aimed at developing metrics for evaluating the degree of trust that can be placed in the computer products used in the processing of sensitive information. In Section 5 we briefly review these proposals. In Section 6 we point out research challenges in database security and attempt to forecast the direction of the field over the next few years. Section 7 concludes the chapter.

1.1 The Relational Data Model Revisited

The relational data model was invented by Codd (1970) and is described in most database textbooks. A relational database supports the relational data model and must have three basic components: a set of relations, a set of integrity rules, and a set of relational operators. Each relation consists of a state-invariant relational schema $RS(A_1, \ldots, A_n)$, where each $A_i$ is called an attribute and is defined over a domain $\mathrm{dom}(A_i)$. A relation $R$ is a state-dependent instance of $RS$ and consists of a set of distinct tuples of the form $(a_1, \ldots, a_n)$, where each element $a_i$ must satisfy $\mathrm{dom}(A_i)$ (i.e., $a_i \in \mathrm{dom}(A_i)$). Integrity constraints restrict the set of theoretically possible tuples (i.e., $\mathrm{dom}(A_1) \times \mathrm{dom}(A_2) \times \cdots \times \mathrm{dom}(A_n)$) to the set of practically meaningful tuples. Let $X$ and $Y$ denote sets of one or more of the attributes $A_i$ in a relational schema. We say $Y$ is functionally dependent on $X$, written $X \rightarrow Y$, if and only if it is not possible to have two tuples with the same value for $X$ but different values for $Y$. Functional dependencies represent the basis of most integrity constraints in the relational model of data.

Since not all possible relations are meaningful in an application, only those that satisfy certain integrity constraints are considered. From the large set of proposed integrity constraints two are of major relevance for security: the key property and the referential integrity property. The key property states that each tuple must be uniquely identified by a key and that a key attribute must not have the null value. Consequently, each real-world event can be represented in the database only once. Referential integrity states that tuples referenced in one relation must exist in others and is expressed by means of foreign keys. These two rules are application-independent and must be valid in each relational database. In addition, many application-dependent semantic constraints may exist in different databases.

Virtual-view relations (or views) are distinguished from base relations. While the former are the result of relational operations and exist only virtually, the latter are actually present in the database and hold the stored data. Relational operations consist of the set operations, a select operation for selecting tuples from relations that satisfy a certain predicate, a project operation for projecting a relation onto a subset of its attributes, and a join operation for combining attributes and tuples from different relations.

The relational data model was first implemented as System R by IBM and as INGRES at U. C. Berkeley. The two projects provided the principal impetus for the field of database security research, considerably advanced the field, and formed the basis of most commercially available products.

A few words on the design of a database are in order. The design of a relational database is a complicated and difficult task and involves several phases and activities. Before the final relation schemas can be determined, a careful requirements analysis and conceptualization of the database is necessary. Usually this is done using a conceptual data model powerful enough to allow the modeling of all application-relevant knowledge. The conceptual model is used as an intermediate representation of the database and is ultimately transferred into corresponding relation schemas. It is very important to use a conceptual data model at this stage, since it is only with such a high-level data model that a database can be created that properly represents all the application-dependent data semantics. The de facto standard for conceptual design is the Entity Relationship (ER) approach (Chen, 1976) or any one of its variants. In its graphical representation and in its simplest form, ER regards the world as consisting of a set of entity types (boxes), attributes (connected to the boxes), and relationship types (diamonds). Relationship types are defined between entity types and are of degree (1:1), (1:n), or (n:m). The degree describes the maximum number of participating entities.

Following is a short example of a relational database. This example will be used throughout the chapter. It is very simple, yet sufficiently complex for presenting many of the security-relevant questions and demonstrating the complexity of the field. Figure 1 contains a conceptualization of the database in the form of an ER diagram and corresponding relational schemas (key attributes and foreign keys are identified in the figure).

Employee (SSN, Name, Dep, Salary)
Project (Title, Subject, Client)
Assignment (Title, SSN, Date, Function)

FIG. 1. Representations of a sample database. (ER diagram not reproduced. Key attributes: SSN in Employee, Title in Project, and Title together with SSN in Assignment; in Assignment, Title and SSN are also foreign keys.)

The database represents the fact that projects within an enterprise are carried out by employees. In this simple example there are three security objects. First, Employee represents a set of employees, each of which is uniquely identified by a characteristic SSN (Social Security Number). Further attributes are Name (of the employee), Department (in which the employee is working), and Salary (of the employee). Second, Project refers to a set of projects carried out by the enterprise. Each project has an identifying Title, a Subject, and a Client. Finally, the security object Assignment contains the assignments of employees to projects. Each Assignment is characterized by the Date of the Assignment and the Function the employee has to perform while participating in the project. A single employee can be assigned to more than one project and a project may be carried out by more than one employee.
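As an illustration, the functional dependency, key property, and referential integrity constraints reviewed in this subsection may be sketched in Python over the sample schemas. This is a minimal sketch under our own assumptions and not part of the relational model literature: all tuple values are invented, relations are represented as lists of dictionaries, and the helper names (project, satisfies_fd, satisfies_key, satisfies_referential_integrity) are hypothetical.

```python
# Hypothetical instances of the three sample relations; all values are invented.
EMPLOYEE = [
    {"SSN": "111", "Name": "Adams", "Dep": "Sales", "Salary": 50000},
    {"SSN": "222", "Name": "Baker", "Dep": "Sales", "Salary": 62000},
    {"SSN": "333", "Name": "Clark", "Dep": "Research", "Salary": 58000},
]
PROJECT = [
    {"Title": "Alpha", "Subject": "Billing", "Client": "Acme"},
    {"Title": "Beta", "Subject": "Logistics", "Client": "Globex"},
]
ASSIGNMENT = [
    {"Title": "Alpha", "SSN": "111", "Date": "1993-01-10", "Function": "Analyst"},
    {"Title": "Alpha", "SSN": "222", "Date": "1993-02-01", "Function": "Manager"},
    {"Title": "Beta", "SSN": "111", "Date": "1993-03-15", "Function": "Tester"},
]


def project(tup, attributes):
    """Project one tuple onto a list of attribute names."""
    return tuple(tup[a] for a in attributes)


def satisfies_fd(relation, lhs, rhs):
    """True if no two tuples agree on lhs but differ on rhs (lhs -> rhs)."""
    seen = {}
    for tup in relation:
        x, y = project(tup, lhs), project(tup, rhs)
        # setdefault stores y on first sight of x and returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


def satisfies_key(relation, key):
    """Key property: key values are unique and contain no null (None)."""
    values = [project(tup, key) for tup in relation]
    no_nulls = all(v is not None for value in values for v in value)
    return no_nulls and len(set(values)) == len(values)


def satisfies_referential_integrity(child, foreign_key, parent, parent_key):
    """Every foreign-key value in child must occur as a key value in parent."""
    existing = {project(tup, parent_key) for tup in parent}
    return all(project(tup, foreign_key) in existing for tup in child)
```

With these invented instances, SSN functionally determines the remaining Employee attributes, whereas Dep does not determine Salary, and the pair (Title, SSN) serves as a key of Assignment because an employee may work on several projects and a project may involve several employees.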

1.2 The Vocabulary of Security and Major Database Security Threats

Before presenting the details of database security research it is necessary to define the terminology used and the potential threats to database security. As we have already pointed out, security requirements are stated by means of a security policy, which consists of a set of laws, rules, and practices that regulate how an organization manages, protects, and distributes sensitive information. In general, a security policy is stated in terms of a set of security objects and a set of security subjects. A security object is a passive entity that contains or receives information. It might be a structured concept like an entire database, a relation, a view, a tuple, an attribute, an attribute value, or even a real-world fact represented in the database.


A security object might also be unstructured, such as a physical memory segment, a byte, a bit, or even a physical device like a printer or a processor. Please note that the term “object” is used differently in other areas of computer science. In the present context, security objects are the target of protection. A security subject is an active entity, often in the form of a person (user) or a process operating on behalf of a user. Security subjects are responsible for making changes in a database state and for causing information to flow between different objects and subjects.

Most of the sources of threats to database security come from outside the computing system. If the emphasis is mainly on authorization, users and processes operating on behalf of users must be subject to security control. An active database process may be operating on behalf of an authorized user who has legitimate access, or it may be active on behalf of an unauthorized person who has succeeded in penetrating the system. In addition, an authorized database user may act as an “information channel” by passing restricted information to unauthorized users, either intentionally or without the knowledge of the authorized user. Some of the most successful database penetration methods are the following:

• Misuses of Authority. Improper acquisition of resources, theft of programs or storage media, modification or destruction of data.

• Logical Inference and Aggregation. Both deal with users authorized to use the database. Logical inference arises whenever sensitive information can be inferred by combining less sensitive data. It may also involve certain knowledge from outside the database system. Closely related to logical inference is the aggregation problem, wherein individual data items are not sensitive though a sufficiently large collection of individual values taken together is sensitive.

• Masquerade. A penetrator may gain unauthorized access by masquerading as an authorized user.

• Bypassing Controls. These might be password attacks or exploitation of system trapdoors that get around intended access control mechanisms. Trapdoors are security flaws built into the source code of a program by the original programmer.

• Browsing. A penetrator may circumvent the protection and search through a directory or read dictionary information in an attempt to locate privileged information. Unless strict need-to-know access controls are implemented, the browsing problem becomes a major flaw of database security.

• Trojan Horses. A Trojan horse is hidden software that tricks a legitimate user into unknowingly performing certain actions. A Trojan horse may be hidden in a sort routine and be designed to release certain data to unauthorized users. Whenever a user activates the sort routine, for example, for the purpose of sorting the result of a database query, the Trojan horse will act using the user's identity and thus will have all the privileges of the user.

• Covert Channels. Usually the information that is stored in a database is retrieved by means of legitimate information channels. In contrast to legitimate channels, covert channels are paths that are not normally intended for information transfer. Such hidden paths may either be storage channels, like shared memory or temporary files that could be used for communication purposes, or timing channels, like a degradation of overall system performance.

• Hardware and Media Attacks. Physical attacks on equipment and storage media.

The attack methods described above are not restricted to databases. For example, the German Chaos Computer Club succeeded in attacking a NASA system via a masquerade, by bypassing access controls (taking advantage of an operating system flaw), and by using Trojan horses to capture passwords. As reported by Stoll (1988), some of these techniques were also used by the Wily Hacker. The Internet worm in 1988 exploited trapdoors in electronic mail handling systems and infected more than 5,000 machines connected to the Internet (Rochlis and Eichin, 1989). Thompson (1984), in his Turing Award Lecture, demonstrated a Trojan horse placed in the executable form of a compiler that permitted the insertion of a trapdoor in each program compiled with the compiler. It is generally agreed that the number of known cases of computer abuse is significantly smaller than the number of actual cases, since a large number of cases in this area remain unreported.

2. Database Security Models

Because of the diversity of application domains for databases, different security models and techniques have been proposed to counter various threats against security. In this section we will discuss the most prominent among them. Put concisely, discretionary security specifies the rules under which subjects can, at their discretion, create and delete objects, and grant and revoke authorizations for accessing objects to other individuals. In addition to controlling access, mandatory security (or protection) regulates the flow of information between objects and subjects. Mandatory security controls are very effective but suffer from several drawbacks. One attempt to overcome certain limitations of mandatory protection systems is the Adapted Mandatory Access Control (AMAC) model, a security technique that focuses on the design aspect of secure databases. The Personal Knowledge approach concentrates on enforcing the basic law of many countries that guarantees the right of individuals to informational self-determination, while the Clark and Wilson model attempts to represent common commercial business practice in a computerized security model.

Early efforts at comparing some of these techniques were those of Biskup (1990) and Pernul and Tjoa (1992). Landwehr (1981) is a very good survey of formal policies for computer security in general, and Millen (1989) focuses on various aspects of mandatory computer security.

2.1 Discretionary Security Models

Discretionary security models are fundamental to operating systems and DBMSs and have been studied for some time. There was a great deal of interest in theoretical aspects of these models in the period from 1970 to 1975. Since that time most relational database security research has been focused on other types of security techniques. The appearance of more advanced data models has, nevertheless, renewed interest in discretionary policies.

2.1.1 Discretionary Access Controls

Discretionary access controls (DAC) are based on a collection of concepts, including a set of security objects $O$, a set of security subjects $S$, a set of access privileges $T$ defining the kinds of access which a subject has to a certain object, and, in order to represent content-based access rules, a set of predicates $P$. Applied to relational databases, $O$ is a finite set of values $\{o_1, \ldots, o_n\}$ understood to represent relation schemas, and $S$ is a finite set of potential subjects $\{s_1, \ldots, s_m\}$ representing users, groups of users, or transactions operating on behalf of users. Access types (privileges) constitute a set of database operations, such as select, insert, delete, update, execute, grant, or revoke, and the predicate $p \in P$ defines the access window of a subject $s \in S$ on object $o \in O$. The tuple $(o, s, t, p)$ is called an access rule, and a function $f$ is defined to determine whether an authorization $f(o, s, t, p)$ is valid or not:

$$f : O \times S \times T \times P \rightarrow \{\mathrm{True}, \mathrm{False}\}.$$

For any $(o, s, t, p)$, if $f(o, s, t, p)$ evaluates to True, subject $s$ has authorization $t$ to access object $o$ within the range defined by predicate $p$.
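To make the role of the access rule $(o, s, t, p)$ more concrete, the following Python fragment gives one possible reading of the function $f$. It is a minimal sketch under our own assumptions: the stored rule base plays the part of $f$, the predicate $p$ is evaluated per tuple to delimit the access window, and the subjects "clerk" and "manager", the rules themselves, and the names AccessRule and is_authorized are all hypothetical.

```python
from typing import Callable, NamedTuple


class AccessRule(NamedTuple):
    obj: str                           # o: name of a relation schema
    subject: str                       # s: user, group, or transaction
    privilege: str                     # t: e.g. "select" or "update"
    predicate: Callable[[dict], bool]  # p: access window, tested per tuple


# Hypothetical rule base; subjects, privileges, and predicates are invented.
RULES = [
    AccessRule("Employee", "clerk", "select", lambda tup: tup["Dep"] == "Sales"),
    AccessRule("Employee", "manager", "select", lambda tup: True),
    AccessRule("Employee", "manager", "update", lambda tup: tup["Salary"] < 100000),
]


def is_authorized(rules, obj, subject, privilege, tup):
    """True if some rule (o, s, t, p) matches and p holds for the tuple."""
    return any(
        rule.obj == obj
        and rule.subject == subject
        and rule.privilege == privilege
        and rule.predicate(tup)
        for rule in rules
    )


sales_tuple = {"SSN": "111", "Name": "Adams", "Dep": "Sales", "Salary": 50000}
research_tuple = {"SSN": "333", "Name": "Clark", "Dep": "Research", "Salary": 58000}
```

Under these invented rules the clerk may select the Sales tuple but not the Research tuple, since only the former falls inside the access window defined by the predicate.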


An important property of discretionary security models is the support of the principle of delegation of rights, where a right is the $(o, t, p)$-portion of the access rule. A subject $s_i$ who holds the right $(o, t, p)$ may be allowed to delegate that right to another subject $s_j$ $(i \neq j)$.

Most systems supporting DAC store access rules in an access control matrix. In its simplest form the rows of the matrix represent subjects, the columns represent objects, and the intersection of a row and a column contains the access types for which that subject has authorization with respect to that object. The access matrix model as a basis for discretionary access controls was formulated by Lampson (1971) and subsequently refined by Graham and Denning (1972) and by Harrison et al. (1976). A more detailed discussion on discretionary controls in databases may be found in the book by Fernandez et al. (1981).

Discretionary security is enforced in most commercial DBMS products and is based on the concept of database views. Instead of authorizing access to the base relations of a system, information in the access control matrix is used to restrict the user to a particular subset of the available data. There are two principal system architectures for view-based protection: query modification and view relations. Query modification is implemented in Ingres-style DBMSs (Stonebraker and Rubinstein, 1976) and consists of appending additional security-relevant qualifiers to a user-supplied query. View relations are unmaterialized queries which are based on physical base relations. Instead of authorizing access to base relations, users are given access to virtual view relations only. Security restrictions can be implemented by means of qualifiers in the view definition. View relations are the underlying protection mechanism of System R-based DBMSs (Griffiths and Wade, 1976).
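The two architectures for view-based protection can be contrasted in a small experiment. The sketch below uses Python's standard sqlite3 module purely as a convenient relational engine; it is not a reconstruction of the Ingres or System R mechanisms. The data, the subject "clerk", the QUALIFIERS table, and the function modified_query are hypothetical, and the query modification is simplified: instead of rewriting the user's WHERE clause, the user query is wrapped as a subquery and the qualifier is applied to its result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (SSN TEXT PRIMARY KEY, Name TEXT, Dep TEXT, Salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [
        ("111", "Adams", "Sales", 50000),     # invented sample tuples
        ("222", "Baker", "Research", 62000),
        ("333", "Clark", "Sales", 58000),
    ],
)

# View relation: an unmaterialized query whose qualifier hides non-Sales
# tuples and whose projection hides the Salary attribute.
conn.execute(
    "CREATE VIEW SalesStaff AS SELECT SSN, Name, Dep FROM Employee WHERE Dep = 'Sales'"
)

# Query modification: security qualifiers stored per (subject, relation).
QUALIFIERS = {("clerk", "Employee"): "Dep = 'Sales'"}


def modified_query(subject, relation, user_query):
    """Wrap the user-supplied query so that the stored qualifier applies."""
    qualifier = QUALIFIERS.get((subject, relation))
    if qualifier is None:
        raise PermissionError(f"{subject} has no access to {relation}")
    return f"SELECT * FROM ({user_query}) WHERE {qualifier}"


view_rows = conn.execute("SELECT Name FROM SalesStaff ORDER BY Name").fetchall()
qm_rows = conn.execute(
    modified_query("clerk", "Employee", "SELECT Name, Dep FROM Employee")
).fetchall()
```

In both cases the subject sees only the Sales tuples. Note that the wrapped query can apply the qualifier only if the user query returns the attribute Dep, which is one reason why real query modification operates on the query's own qualification instead.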
2.1.2 DAC-Based Structural Limitations

Although quite common, discretionary models suffer from major drawbacks when applied to databases with security-critical content. In particular, the following limitations are encountered:

• Enforcement of Security Policy. DAC is based on the concept of ownership of information. In contrast to enterprise models, where the whole enterprise is the “owner” of the information and responsible for granting access to stored data, DAC systems assign ownership of the information to the creator of data items in the database and allow the creator the authority to grant access to other users. This has the disadvantage that the burden of enforcing the security requirements of the enterprise becomes the responsibility of the users themselves and can be monitored by the enterprise only at great expense.

• Cascading Authorization. If two or more subjects have the privilege of granting or revoking certain access rules to other subjects, cascading revocation chains may ensue. As an example, consider subjects $s_1$, $s_2$, and $s_3$, and an access rule $(o, s_1, t, p)$. Subject $s_2$ receives the privilege $(o, t, p)$ from $s_1$ and grants this access rule to $s_3$. Later, $s_1$ grants $(o, t, p)$ again to $s_3$, but $s_2$ takes the privilege $(o, t, p)$ away from $s_3$ for some reason. The effect of these operations is that $s_3$ still has the authorization (from $s_1$) to access object $o$ by satisfying the predicate $p$ and using privilege $t$, even though subject $s_2$ has revoked this authorization. As a consequence, subject $s_2$ is not aware of the fact that the authorization $(o, s_3, t, p)$ is still in effect.

• Trojan Horse Attacks. In systems supporting DAC the identity of the subjects is crucial. If actions can be performed by one subject using another subject's identity, DAC can be subverted. By a “Trojan horse” is understood software that grants a certain right $(o, t, p)$ held by subject $s_i$ to subject $s_j$ $(i \neq j)$ without the knowledge of subject $s_i$. Any program that runs on behalf of a subject acts under the identity of this subject and, therefore, possesses all the DAC access rights of the subject's processes. If a program contains a Trojan horse that has the functionality of granting access rules to other users, this feature cannot be restricted by discretionary access control methods.

• Updating Problems. View-based protection results in unmaterialized queries which have no explicit physical representation in the database. This has the advantage of providing a high level of flexible support to subjects with different views and of automatically filtering out all data a subject is not authorized to access, but it has the disadvantage that not all data can be updated through certain views. This limitation results from integrity constraints that might be violated in data not contained in the view once the data from the view are updated.
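The cascading authorization scenario from the list above can be replayed in a few lines of Python. This is a minimal sketch under our own assumptions, not the revocation algorithm of any particular DBMS: a single right $(o, t, p)$ is tracked, each grant is recorded as an edge (grantor, grantee), and a subject is taken to hold the right if it is reachable from the original holder through the remaining edges. The class name GrantGraph and its methods are hypothetical.

```python
class GrantGraph:
    """Tracks who granted one fixed right (o, t, p) to whom."""

    def __init__(self, owner):
        self.owner = owner
        self.grants = set()  # edges (grantor, grantee)

    def holds(self, subject):
        """A subject holds the right if a chain of grants leads back to the owner."""
        reached, frontier = {self.owner}, [self.owner]
        while frontier:
            grantor = frontier.pop()
            for source, target in self.grants:
                if source == grantor and target not in reached:
                    reached.add(target)
                    frontier.append(target)
        return subject in reached

    def grant(self, grantor, grantee):
        if not self.holds(grantor):
            raise PermissionError(f"{grantor} does not hold the right")
        self.grants.add((grantor, grantee))

    def revoke(self, grantor, grantee):
        # Only the edge issued by this grantor disappears; edges issued by
        # other grantors to the same grantee are untouched.
        self.grants.discard((grantor, grantee))


# Replay of the scenario: s1 -> s2, s2 -> s3, s1 -> s3, then s2 revokes from s3.
graph = GrantGraph(owner="s1")
graph.grant("s1", "s2")
graph.grant("s2", "s3")
graph.grant("s1", "s3")
graph.revoke("s2", "s3")
s3_still_authorized = graph.holds("s3")
```

After the replay, s3 remains authorized through the grant issued by s1, which is exactly the situation of which s2 is unaware. If s1 instead revoked the right from s2 in a setting where s3 had received it only from s2, s3 would lose it as well, since the remaining edge no longer leads back to the owner.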

2.2 Mandatory Security Models

Mandatory policies address a higher level of threat than do discretionary policies since, in addition to controlling access to data, they control the flow of data as well. Moreover, mandatory security techniques do not suffer from the structural limitations of DAC-based protection.

2.2.1 Mandatory Access Controls

Whereas discretionary models are concerned with defining, modeling, and enforcing access to information, mandatory security models are, in addition, also concerned with the flow of information within a system.


Mandatory security requires that security objects and subjects be assigned certain security levels represented by a label. The label for an object o is called its classification (class(0))and a label for a subject s is called its clearance (cleafls)). The classification represents the sensitivity of the labeled data, while the clearance of a subject its trustworthiness to not disclose sensitive information to others. A security label consists of two components: a level from a hierarchical list of sensitivity levels or access classes (for example: top-secret > secret > confidential > unclassified) and a member of a nonhierarchical set of categories, representing classes of object types of the universe of discourse. Clearance and classification levels, are totally ordered, while the resulting security labels are only partially ordered; thus, the set of classifications forms a lattice. In this lattice security class c1 is comparable to and dominates (2)security class c2 if the sensitivity level of c1 is greater than or equal to that of c2 and if the categories in c, contain those in c, . Mandatory security grew out of the military environment where the practice is to label information. However, this custom is also common in many companies and organizations where labels such as “confidential” or “company confidential” are used. Mandatory access control (MAC) requirements are often stated following Bell and LaPadula (1976) and formalized in the following two rules. The first (simple property) protects the information of a database from unauthorized disclosure, and the second (*-property) protects data from contamination or unauthorized modification by restricting the information flow from high to low: 1. Subject s is allowed to read data item d if clear(s) 1 class(d). 2. Subject s is allowed to write data item d if cleafls) Iclass(d). A few final remarks on MAC policies are in order. 
In many discussions confusion has arisen concerning the fact that in mandatory systems it is not enough to have stringent controls over who can read which data. Why is it necessary to include stringent controls over who can write which data in systems with high security requirements? The reason is that a system with high security needs must protect itself against attacks from unauthorized as well as from authorized users. There are several ways authorized users may disclose sensitive information to others. This can happen by mistake, as a deliberate illegal action, or the user may be tricked into doing so by a Trojan horse attack. The simplest way in which information is disclosed by an authorized user occurs when information is retrieved from a database, copied into an “owned” object, and the copy then made available to others. To prevent an authorized user from doing so, it is necessary to control his ability to make copies (which implies the writing of data). In particular,


once a transaction has successfully completed a read attempt, the protection system must ensure that no write to a lower security level (write-down) can be caused by a user who is authorized to execute a read transaction. As read and write checks are both mandatory controls, a MAC system successfully protects against attempts to copy information and grant copies to unauthorized users. By not allowing higher classified subjects the capability to “write-down” on lower classified data, the information flow among subjects with different clearances can be efficiently controlled. Inasmuch as covert storage channels require writing to objects, the *-property also helps limit leakage of information along such hidden paths. Mandatory integrity policies have also been studied. Biba (1977) has formulated an exact mathematical dual of the Bell-LaPadula model with integrity labels and two properties: no write-up in integrity and no read-down in integrity. That is, low-integrity objects (including subjects) are not permitted to contaminate objects of higher integrity, or, in other words, no resource is permitted to depend upon other resources unless the latter are at least as trustworthy as the former. As an interesting optional feature, mandatory security and the Bell-LaPadula (BLP) paradigm may lead to multilevel databases. These are databases containing relations which appear to be different to users with different clearances. This comes about in two ways: first, subjects are not authorized to access all the data regardless of their clearance, and, second, the support of MAC may lead to polyinstantiation of attributes or tuples. We will discuss polyinstantiation and the multilevel relational data model in more detail in the next subsection.
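The dominance relation and the two BLP rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Label, dominates, may_read, may_write, and lub are hypothetical, and the category component is modeled here as a set of category names so that "contains" can be tested directly.

```python
from dataclasses import dataclass

# Sensitivity levels named in the text, ordered from lowest to highest.
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top-secret": 3}


@dataclass(frozen=True)
class Label:
    level: str
    categories: frozenset = frozenset()  # assumption: a set of category names


def dominates(c1: Label, c2: Label) -> bool:
    """c1 >= c2: level at least as high and categories a superset."""
    return LEVELS[c1.level] >= LEVELS[c2.level] and c1.categories >= c2.categories


def may_read(clearance: Label, classification: Label) -> bool:
    """Simple property: clear(s) >= class(d)."""
    return dominates(clearance, classification)


def may_write(clearance: Label, classification: Label) -> bool:
    """*-property: clear(s) <= class(d), i.e. no write-down."""
    return dominates(classification, clearance)


def lub(c1: Label, c2: Label) -> Label:
    """Least upper bound of two labels in the lattice."""
    level = max(c1.level, c2.level, key=LEVELS.get)
    return Label(level, c1.categories | c2.categories)
```

Under this sketch a subject cleared at "secret" may read an "unclassified" item but may not write to it, and two labels whose category sets are not nested are incomparable: neither dominates the other, which is why the labels form a lattice and not a total order.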

2.2.2 The Multilevel Secure Relational Data Model

In this subsection we will define the basic components of the multilevel secure (MLS) relational data model. We will consider the most general case, i.e., the case in which an individual attribute value is subject to a security label assignment. We start by using the sample database scenario from the Introduction. Throughout the text, whenever the example is referred to, the existence of four sensitivity levels, denoted TS, S, Co, and U (where TS > S > Co > U), and only one category is assumed. In each relational schema TC is an additional attribute and contains the tuple classification. Consider the three different instances of the relation “Project” given in Fig. 2. Figure 2(a) corresponds to the view of a subject s with clear(s) = S. Because of the simple property of BLP (read-access rule), users cleared at U see the instance of Project shown in Fig. 2(b). In this case the simple property of BLP automatically filters out data whose classification strictly dominates U. Consider further a subject s with clear(s) = U and an insert operation in which the


(a) Project_S

Title        Subject          Client    TC
Alpha, S     Development, S   A, S      S
Beta, U      Research, S      B, S      S
Celsius, U   Production, U    C, U      U

(b) Project_U

Title        Subject          Client    TC
Beta, U      null, U          null, U   U
Celsius, U   Production, U    C, U      U

(c) Project after the polyinstantiating insert [panel missing from the source]

FIG. 2. Instances of MLS relation “Project”.

user wishes to insert the tuple (Alpha, Production, D) into the relation shown in Fig. 2(b). Because of the key integrity property, a standard relational DBMS would not allow this operation (although not seen by user s, the key Alpha already exists in Project). However, from a security point of view, the insert must not be rejected because otherwise there will be a covert signalling channel from which s may conclude that sensitive information he is not authorized to access may exist. The outcome of the operation is shown in Fig. 2(c) and consists of a polyinstantiated tuple in the MLS relation Project. A similar situation occurs if a subject cleared for the U-level updates (Beta, null, null) in Project as shown in Fig. 2(b) by replacing the null values with certain data items. Again, this leads to polyinstantiation in Project. As another example of polyinstantiation, assume that a subject s with clear(s) = S wishes to update (Celsius, Production, C). In systems supporting MAC such an update is not allowed because of the *-property of BLP, so as to prevent an undesired information flow from subjects cleared at the S-level to subjects cleared at the U-level. Thus, if an S-level subject wishes to update the tuple, the update again must result in polyinstantiation. The problem of polyinstantiation arises out of the need to avoid a covert channel. Lampson (1973) has defined a covert channel as a means of downward information flow. As an example let us consider the situation just described once again. If an insert operation initiated by some subject is rejected because of the presence of a tuple at a higher level, the subject


might be able to infer the existence of that tuple, resulting in a downward information flow. With respect to security, much more may happen than just inferring the presence of a tuple. The success or failure of the service request, for example, can be applied repeatedly to communicate one bit of information (0: failure, 1: success) to a lower level. Therefore, the problem is not only that of inferring a classified tuple; moreover, any information visible at the higher level can be sent through a covert channel to the lower level. The theory of most data models is built around the concept that a real-world fact may be represented in a database only once. Because of polyinstantiation, this fundamental property is no longer true for MLS databases, thus requiring the development of a new theory. The state of development of MLS relational theory has been considerably advanced by research in the SeaView project (see Denning et al., 1988 or Lunt et al., 1990). The following discussion of the theoretical concepts underlying the MLS relational data model is based principally on the model developed by Jajodia and Sandhu (1991a). In the Jajodia-Sandhu model, each MLS relation consists of a state-invariant multilevel relational schema RS(A1, C1, ..., An, Cn, TC), where each Ai is an attribute defined over a domain dom(Ai), each Ci is a classification for Ai, and TC is the tuple-class. The domain of Ci is defined by [Li, Hi], which is a sublattice consisting of all security labels ranging from Li to Hi. The resulting domain of TC is [lub{Li, i = 1..n}, lub{Hi, i = 1..n}], where lub denotes the least-upper-bound operation in the sublattice of security labels. In the Jajodia-Sandhu model TC is included but is an unnecessary attribute. A multilevel relation schema corresponds to a collection of state-dependent relation instances R_c, one for each access class c. A relation instance is denoted by R_c(A1, C1, ..., An, Cn, TC) and consists of a set of distinct tuples of the form (a1, c1, ..., an, cn, tc), where each ai ∈ dom(Ai), c ≥ ci, ci ∈ [Li, Hi], and tc = lub{ci, i = 1..n}. We use the notation t[Ai] to refer to the value of attribute Ai in tuple t, while t[Ci] denotes the classification of Ai in tuple t. Because of the simple property of BLP, t[Ai] is visible for subjects with clear(s) ≥ t[Ci]; otherwise t[Ai] is replaced with the null value. The standard relational model is based on two core integrity properties: the key property and the referential integrity property. In order to meet the requirements for MLS databases, both have been adapted and two further properties have been introduced. In the standard relational data model a key is derived by using the concept of functional dependencies. In the MLS relational model such a key is called an apparent key. Its notion has been defined by Jajodia et al. (1990). For the following we assume that


RS(A1, C1, ..., An, Cn, TC) is an MLS relational schema and that A (A ⊆ {A1, ..., An}) is the attribute set that forms its apparent key.

[MLS integrity property 1]: Entity integrity. An MLS relation R satisfies entity integrity if and only if for all instances R_c and t ∈ R_c the following conditions hold:
1. Ai ∈ A ⇒ t[Ai] ≠ null
2. Ai, Aj ∈ A ⇒ t[Ci] = t[Cj]
3. Ai ∉ A ⇒ t[Ci] ≥ t[C_A] (C_A is the classification of key A).
Entity integrity states that the apparent key may not have the null value, and must be uniformly classified, and that its classification must be dominated by all the classifications of the other attributes.

[MLS integrity property 2]: Null integrity. R satisfies null integrity if and only if for each instance R_c of R the following conditions hold:
1. For every t ∈ R_c, t[Ai] = null ⇒ t[Ci] = t[C_A]
2. R_c is subsumption free, i.e., it does not contain two distinct tuples such that one subsumes the other. A tuple t subsumes a tuple s if for every attribute Ai, either t[Ai, Ci] = s[Ai, Ci] or t[Ai] ≠ null and s[Ai] = null.
Null integrity states that null values must be classified at the level of the key and that for subjects cleared for higher security classes, null values visible to lower clearances are replaced by the proper values automatically.

The next property deals with consistency between the different instances R_c of R. The inter-instance property was first defined by Denning et al. (1988) within the SeaView framework, later corrected by Jajodia and Sandhu (1990b) and later again included in SeaView by Lunt et al. (1990).

[MLS integrity property 3]: Inter-instance integrity. R satisfies inter-instance integrity if for all instances R_c of R and all c' < c, a filter function σ produces R_c'. In this case R_c' = σ(R_c, c') must satisfy the following conditions:
1. For every t ∈ R_c such that t[C_A] ≤ c' there must be a tuple t' ∈ R_c' with t'[A, C_A] = t[A, C_A] and, for Ai ∉ A,

   t'[Ai, Ci] = t[Ai, Ci]        if t[Ci] ≤ c'
   t'[Ai, Ci] = (null, t[C_A])   otherwise.

2. There are no additional tuples in R_c' other than those derived by the above rule. R_c' is made subsumption free.
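The filter function of Property 3 can be sketched as follows. This is a minimal illustration and not the Jajodia-Sandhu or SeaView implementation: it assumes a single-attribute apparent key, one category, tuples represented as dictionaries mapping an attribute name to a (value, classification) pair, and it leaves out TC, which can be derived as the least upper bound of the attribute classifications. The names leq, subsumes, and filter_instance are hypothetical.

```python
# Total order of the four levels used in the running example.
ORDER = {"U": 0, "Co": 1, "S": 2, "TS": 3}


def leq(a, b):
    """True if level a is dominated by level b."""
    return ORDER[a] <= ORDER[b]


def subsumes(t, s):
    """t subsumes s if each cell is equal, or t has a value where s is null."""
    return all(
        t[a] == s[a] or (t[a][0] is not None and s[a][0] is None) for a in t
    )


def filter_instance(instance, key, c_low):
    """Sketch of sigma(R_c, c'): derive the instance visible at class c_low."""
    result = []
    for t in instance:
        key_class = t[key][1]
        if not leq(key_class, c_low):
            continue  # the apparent key itself is not visible at c_low
        row = {}
        for attr, (value, cls) in t.items():
            # Visible cells are copied; hidden cells become (null, key class).
            row[attr] = (value, cls) if leq(cls, c_low) else (None, key_class)
        if row not in result:
            result.append(row)
    # Make the result subsumption free.
    return [
        t for t in result
        if not any(s is not t and subsumes(s, t) for s in result)
    ]
```

Applied to the S-level instance of Project with c' = U, this sketch drops the Alpha tuple, whose key is classified S, and keeps the Beta tuple with its Subject and Client replaced by nulls classified at the key level U.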


The inter-instance property is concerned with consistency between relation instances of a multilevel relation R. The filter function σ maps R to different instances R_c' (one for each c' < c). Through the use of filtering a user is restricted to that portion of the multilevel relation for which the user is cleared. If c' dominates some security levels in a tuple but not others, then during query processing the filter function σ replaces all attribute values the user is not cleared to see by null values. Because of this filter function a shortcoming arises in the Jajodia-Sandhu model which was pointed out by Smith and Winslett (1992). Smith and Winslett state that σ introduces an additional semantics for nulls. In the Jajodia-Sandhu model a null value can now mean “information available but hidden,” and this null value cannot be distinguished from a null value representing the semantics “value exists but not known” or a null value with the meaning “this property will never have a value.” In a database all kinds of nulls may be present and at a certain security level it may be difficult for subjects to say what should be believed at that level. Let us now turn our attention to polyinstantiation. As we have seen in the example given earlier, polyinstantiation may occur on a number of different occasions, for example, when a user with low clearance attempts to insert a tuple that already exists with higher classification, or when a user wishes to change values in a lower classified tuple. Polyinstantiation may also occur because of a deliberate action in the form of a cover story, where lower cleared users should not be supplied with the proper values of a certain fact. Some researchers state that the use of polyinstantiation to establish cover stories is a bad idea and should not be permitted. However, if supported, it may not occur within the same access class.

[MLS integrity property 4]: Polyinstantiation integrity. R satisfies polyinstantiation integrity if for every R_c and each attribute Ai, the functional dependency A, Ci → Ai (i = 1..n) holds. Property 4 states that an apparent key A and the classification of an attribute correspond to one and only one value of the attribute, i.e., polyinstantiation may not occur within a single access class. In many DBMSs supporting an MLS relational data model, multilevel relations exist only at the logical level. In such systems multilevel relations are decomposed into a collection of single-level base relations which are then physically stored in the database. Completely transparent multilevel relations are constructed from these base relations upon user demand. The reasons underlying this approach are mainly practical in nature. First, fragmentation of data based on the sensitivity of the data is a natural and


intuitive solution to security and, second, available and well-accepted technology may be used for implementation of MLS systems. In particular, the decomposition approach has the advantage of not requiring extension of the underlying trusted computing base (TCB) to include mandatory controls on multilevel relations, which means that the TCB can be implemented with a small amount of code. Moreover, it allows the DBMS to run mainly as an untrusted application on top of the TCB. We will come back to this issue in Section 3 in a discussion of different implementations of trusted DBMSs.
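Property 4 (polyinstantiation integrity) lends itself to a simple instance-level check. The sketch below follows the functional dependency A, Ci → Ai exactly as stated in the text and assumes a single-attribute apparent key and tuples represented as dictionaries mapping an attribute name to a (value, classification) pair. The function name and the sample tuples, including the client value "X", are hypothetical.

```python
def satisfies_polyinstantiation_integrity(instance, key):
    """Check that (key value, classification of A_i) determines the value of A_i."""
    seen = {}
    for t in instance:
        key_value = t[key][0]
        for attr, (value, cls) in t.items():
            if attr == key:
                continue  # the dependency is trivial for the key itself
            determinant = (key_value, attr, cls)
            # setdefault stores the first value seen for this determinant.
            if seen.setdefault(determinant, value) != value:
                return False  # two values at the same access class
    return True


# Polyinstantiation across access classes (S and U) is allowed.
project = [
    {"Title": ("Alpha", "S"), "Subject": ("Development", "S"), "Client": ("A", "S")},
    {"Title": ("Alpha", "U"), "Subject": ("Production", "U"), "Client": ("X", "U")},
]
```

With these two tuples the check succeeds, because the two subjects recorded for Alpha carry different classifications. Adding a third tuple that gives Alpha a second U-classified subject would make the check fail, which is the situation Property 4 rules out.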

2.2.3 MAC-Based Structural Limitations

Although more restrictive than DAC models, MAC techniques require certain extensions in order to be applied to databases in an efficient way. In particular, the following drawbacks in multilevel secure databases and mandatory access controls based on BLP represent structural limitations:

• Granularity of the Security Object. It is not yet agreed what should be the granularity of labeled data. Proposals range from protecting whole databases, to protecting files, protecting relations, attributes, or even certain attribute values. In any case, careful labeling is necessary since otherwise inconsistent or incomplete label assignments could result.

• Lack of an Automated Security Labeling Technique. Databases usually contain a large collection of data and serve many users, and in many civil applications the labeled data are not available. This is why manual security labeling is necessary, though it may also result in an almost endless process for large databases. Therefore, support techniques are needed, in the form of guidelines and design aids for multilevel databases, tools to help in determining relevant security objects, and tools that suggest clearances and classifications.

• N-persons Access Rules. Because of information flow policies, higher cleared users are restricted from writing-down on lower classified data items. However, organizational policies may require that certain tasks be carried out by two or more persons (four-eyes principle) having different clearances. As an example, consider subjects s1, s2 with clear(s1) > clear(s2), data item d with class(d) = clear(s2), and a business rule that specifies that a write by s2 on d requires the approval of s1. Following Bell-LaPadula’s write-access rule it would be necessary for s1 and s2 to have the same level of clearance. This may be inadequate in business applications of MLS database technology.

2.3 The Adapted Mandatory Access Control Model

The principal goals of the Adapted Mandatory Access Control (AMAC) model are to adapt mandatory access controls to better fit general-purpose data processing practice and to offer a design framework for databases containing sensitive information. In order to overcome the MAC-based limitations discussed earlier, AMAC offers several features that assist the database designer in performing the different activities involved in designing a database containing sensitive information. AMAC has the following advantages when used as a security technique for databases:

• The technique supports all phases of the database design process and can be used to construct discretionary-protected as well as mandatory-protected databases.

• If mandatory protection is required, a supporting policy for the purpose of deriving database fragments as the target of protection is provided. This responds to concerns regarding the granularity of security objects in multilevel systems.

• If mandatory protection is required, automated security labeling of security objects and subjects is supported. Automated labeling leads to candidate security labels that can be refined by a human security administrator if necessary. This overcomes the limitation that labeled data often is not available.

• In AMAC security is enforced through the use of database triggers and thus can be fine-tuned to meet application-dependent security requirements. For example, the n-eyes principle may be supported in some applications but not in others where information flow control is a major concern of the security policy.

We will first give a general overview of the AMAC technique followed by a more formal discussion and an example.

2.3.1 AMAC General Overview

Adapted mandatory security belongs to the class of role-based security models which assume that each potential user of the system performs a certain role in the organization. Based on their role, users are authorized to execute specific database operations on a predefined set of data. The AMAC model covers not only access control issues; it also includes a database design environment with the principal emphasis on the security of the databases which are produced. These databases may be implemented in DBMSs that support DAC exclusively or in DBMSs that support both DAC and MAC. The technique combines well known and widely accepted


concepts from the field of data modeling with concepts from the area of data security research. In AMAC the following are the design phases for security-critical databases:

1. Requirements Analysis and Conceptual Design. Based on the role which they perform in the organization, potential users of a database may be classified into a number of different groups whose data and security requirements may differ significantly. The Entity-Relationship (ER) model and its variants serve as an almost de facto standard for conceptual database design and have been extended in AMAC to model and describe security requirements. The security and data requirements of each role performed in an organization are described by the individual ER schemas and form the view (perception) which each user group has of the enterprise data. Note that in this setting the notion of a view embraces all the information which a user performing a certain role in an organization is aware of. This information includes data, security requirements, and functions. Thus, the notion of views here is different from its sense in a DAC environment. To arrive at a conceptualization of the whole information system as seen from the viewpoint of the enterprise, AMAC employs view-integration techniques in a further design step. The resulting conceptual database model is described by a single ER schema which is extended by security flags that indicate security requirements entailed by certain user roles.

2. Logical Design. In order to implement the conceptual schema in a DBMS, a transformation from the ER schema to the data model supported by the DBMS in use is necessary. AMAC contains general rules and guidelines for the translation of ER schemas into the relational data model. The output of the transformation process is a set of relational schemas, global dependencies that are defined between schemas and are necessary for maintaining database consistency in the further design steps, and a set of views, which now describe the access requirements entailed by the relation schemas. If the DBMS that is to hold the resulting database is capable only of supporting DAC, the relational schemas become candidates for implementation and the view descriptors may be employed as discretionary access controls. If a particular DBMS supports MAC, further design activities are necessary. The Requirements Analysis, Conceptual and Logical Design phases in AMAC are described by Pernul and Tjoa (1991).

3. The AMAC Security Object. In order to enforce mandatory security it is necessary to decide which security objects and security subjects are subject to security label assignments. In AMAC a security object is a database fragment and a subject is a view. Fragments are derived using structured database decomposition and views are derived by combining


these fragments. A fragment is the largest area of the database to which two or more views have access in common. Additionally, no view exists with access to a subset of the fragment only. Pernul and Luef (1991) developed the structured decomposition approach and the automated labeling policy. Their work includes techniques for a lossless decomposition into fragments and algorithms to keep fragmented databases consistent during database update. It should be noted that a database decomposition into disjoint fragments is a natural way of implementing security controls in databases.

4. Support of Automated Security Labeling. As in most applications labeled data is not available, AMAC offers a supporting policy for automated security labeling of security objects and security subjects. Automated labeling is based on the following assumption: The larger the number of users cleared to access a particular fragment, the lower the sensitivity of the data contained in the fragment and, thus, the lower the level of classification with which the fragment has to be provided. This assumption would appear to be valid, inasmuch as a fragment that is accessed by many users will not contain sensitive information and, on the other hand, a fragment that is accessible to only a few users can be classified as highly sensitive. Views (respectively, users having a particular view as their access window to data) are ordered based on the number of fragments they may access (are defined over) and, in addition, based on the classifications assigned to the fragments. In general, a view needs a clearance that allows corresponding users to access all the fragments which the view is defined over. A suggested classification class(F) applies to an entire fragmental schema F as well as all attribute names and type definitions for the schema, while a suggested clearance clear(V) applies to all transactions executing on behalf of a user V. It should be noted that classifications and clearances are only candidates for security labels and may be refined by a human database designer if necessary.

5. Security Enforcement. In AMAC fragments are physically stored and access to a fragment may be controlled by a reference monitor. Security is enforced by means of trigger mechanisms. Triggers are hidden rules that can be fired (activated) if a fragment is affected by certain database operations. In databases security-critical operations include the select (read-access), insert, delete, and update (write-access) commands. In AMAC select triggers are used to route queries to the proper fragments, insert triggers are responsible for decomposing tuples and inserting corresponding sub-tuples into the proper fragments, and update and delete triggers are responsible for protecting against unauthorized modification by restricting information flow from high to low in cases that could lead to undesired information transfer. The operational semantics of AMAC database operations and the construction of the select and insert triggers are outlined by Pernul (1992a).
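The labeling assumption of phase 4 can be illustrated with a small sketch. It is one possible reading of the stated heuristic and not the labeling policy of Pernul and Luef (1991), whose details are not given here. The function name suggest_labels, the integer levels (0 is the lowest), and the sample fragment and view names are all hypothetical.

```python
def suggest_labels(access):
    """Suggest candidate labels from a mapping fragment -> set of views.

    Fragments reachable by more views get a lower classification; a view's
    clearance is the highest classification among the fragments it accesses.
    """
    # Distinct audience sizes, largest first: the most widely shared
    # fragments receive level 0.
    sizes = sorted({len(views) for views in access.values()}, reverse=True)
    level_of = {size: level for level, size in enumerate(sizes)}

    classification = {f: level_of[len(views)] for f, views in access.items()}

    clearance = {}
    for fragment, views in access.items():
        for view in views:
            clearance[view] = max(clearance.get(view, 0), classification[fragment])
    return classification, clearance


# Hypothetical access structure: F1 is widely shared, F3 is seen by V1 only.
access = {"F1": {"V1", "V2", "V3"}, "F2": {"V1", "V2"}, "F3": {"V1"}}
classification, clearance = suggest_labels(access)
```

In this example F3, seen by a single view, receives the highest candidate classification, and V1 receives the highest clearance because it is defined over all three fragments. As the text notes, such labels are only candidates that a human designer may refine.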


2.3.2 Technical Presentation of AMAC. An Example

In AMAC security constraints are handled in the course of database design as well as query processing. In the course of database design they are expressed by the database decomposition, while during query processing they are enforced by trigger mechanisms. In the discussion which follows we will give the technical details of the decomposition process, the decomposition itself, the automated security-labeling process, and certain integrity constraints that have to be considered in order to arrive at a satisfactory fragmentation. In AMAC it is assumed that the Requirements Analysis is performed on an individual user group basis and that the view which each user group has of the database is represented by an ER model. The ER model has been extended to cover, besides the data semantics, the access restrictions of the user group. The next design activity is view integration. View integration techniques are well established in conceptual database design and consist in integration of the views of individual user groups into a single conceptual representation of the database. In AMAC the actual integration is based on a traditional approach and consists of two steps: integration of entity types and integration of relationship types (Pernul and Tjoa, 1991). During the integration, correspondences between modeling constructs in different views are established and, based on the different possible correspondences, the integration is performed. Following integration the universe of discourse is represented by a single ER diagram extended by the access restrictions for each user group. The next step is to transform the conceptual model into a target data model. AMAC offers general rules for the translation into the relational data model. The translation is quite simple and results in three different types of modeling constructs: relation schemas (entity-type relations or relationship-type relations), interrelational dependencies defined between relation schemas, and a set of view descriptors defined on relation schemas and representing security requirements in the form of access restrictions for the different user groups. In the relational data model user views have no conceptual representation. The decomposition and labeling procedure in AMAC is built around the concept of a user view, entailing a simple extension of the relational data model. Let RS(ATTR, LD) be a relational schema with ATTR a set of attributes {A1, ..., An}. Each Ai ∈ ATTR has domain dom(Ai). LD is a set of functional dependencies (FDs) restricting the set of theoretically possible instances of a relation R with schema RS (i.e., ×i dom(Ai)) to the set of semantically meaningful instances. A relation R with schema RS consists in


a set of distinct instances (tuples) {t1, ..., tm} of the form (a1, ..., an) where ai is a value within dom(Ai). Let RS1(ATTR1, LD1) and RS2(ATTR2, LD2) be two relational schemas with corresponding relations R1 and R2. Let X and Y denote two attribute sets with X ⊆ ATTR1 and Y ⊆ ATTR2. The interrelational inclusion dependency (ID) RS1[X] ⊆ RS2[Y] holds if for each tuple t ∈ R1 there exists at least one tuple t' ∈ R2 with t[X] = t'[Y]. If Y is a key in RS2, the ID is called key-based and Y is said to be a foreign key in RS1. Let V = {V1, ..., Vp} be a set of views. A view Vi (Vi ∈ V, i = 1..p) consists of a set of descriptors specified in terms of attributes and a set of conditions on these attributes. The set of attributes spanned by the view can belong to one or more relation schemas. View conditions represent the access restrictions of a particular user group on the underlying base relations. For each user group there must be at least one view. The concepts defined above serve as the basis of the AMAC conceptual start schema SS. SS may be defined by a triple SS(ℜ, GD, V), where:

ℜ = {RS1(ATTR1, LD1), ..., RSn(ATTRn, LDn)} is a set of relational schemas,

GD = {ID1, ..., IDk} is a set of key-based IDs,

V = {V1, ..., Vp} is a set of views.
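The inclusion dependency defined above can be checked on relation instances as in the following sketch. It assumes relations are lists of dictionaries; the function names and the sample rows are hypothetical, while the relation names are borrowed from the running example. Note that is_key only tests uniqueness in a given instance, whereas a key is properly a property of the schema.

```python
def inclusion_dependency_holds(r1, x, r2, y):
    """RS1[X] ⊆ RS2[Y]: every X-projection of r1 occurs as a Y-projection of r2."""
    targets = {tuple(t[a] for a in y) for t in r2}
    return all(tuple(t[a] for a in x) in targets for t in r1)


def is_key(relation, attrs):
    """Instance-level check: no two tuples agree on all attributes in attrs."""
    projected = [tuple(t[a] for a in attrs) for t in relation]
    return len(projected) == len(set(projected))


# Hypothetical sample rows.
project = [
    {"Title": "Alpha", "Subject": "Development", "Client": "A"},
    {"Title": "Beta", "Subject": "Research", "Client": "B"},
]
assignment = [
    {"Title": "Alpha", "SSN": 1, "Date": "1993-01-01", "Function": "lead"},
    {"Title": "Beta", "SSN": 1, "Date": "1993-02-01", "Function": "tester"},
]

# The ID is key-based if it holds and Title is a key of Project.
key_based = inclusion_dependency_holds(
    assignment, ["Title"], project, ["Title"]
) and is_key(project, ["Title"])
```

A key-based ID of this kind is what an insert, delete, or modification rule would have to preserve in order to maintain referential integrity.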

If DAC protection is sufficient, the relational schemas are candidates for implementation in a DBMS, the views may be used to implement content-based access controls, and the set GD of global dependencies may be associated with an insert rule, a delete rule, and a modification rule in order to ensure referential integrity during database operation. If DAC is not sufficient and MAC has to be supported, it is necessary to decide which are the security objects and subjects and to assign appropriate classifications and clearances. In order to express the security requirements defined by means of the views, a decomposition of SS into single-level fragments is necessary. The decomposition is based on the derived view structure and results in a set of fragmental schemas in such a way that no view is defined over a subset of the resulting schema exclusively. A single classification is assigned to each fragmental schema and the decomposition is performed by means of a vertical, horizontal, or derived horizontal fragmentation policy. A vertical fragmentation (vf) results in a set of vertical fragments {F1, ..., Fr} and is the projection of a relation schema RS onto a subset of its attributes. In order that the decomposition be lossless, the key of RS must be included in each vertical fragment. A vertical fragmentation (vf) R = {F1, ..., Fr} of a relation R is correct if every tuple t ∈ R is the concatenation of (v1, ..., vr) with vi the tuple in Fi (i = 1..r). (vf) is used


to express “simple” security constraints that restrict access to certain attributes. The effects of (vf) on an existing set of FDs have been studied by Pernul and Luef (1991), who showed that if R is not in 3NF (third normal form), some FDs might get lost in a decomposition. To produce a dependency-preserving decomposition in AMAC, Pernul and Luef suggested including virtual attributes (not visible to any user) and updating clusters in vertical fragments if a schema is not in 3NF. A horizontal fragmentation (hf) is a subdivision of a relation R with schema RS(ATTR, LD) into subsets of its tuples based on the evaluation of a predicate defined on RS. The predicate is expressed as a boolean combination of terms, with each term a simple comparison that can be established to be true or false. An attribute on which (hf) is defined is called a selection attribute. A (hf) is correct if every tuple of R is mapped into exactly one resulting fragment. Appending one horizontal fragment to another leads to a further horizontal fragment or to R again. (hf) is used to express access restrictions based on the content of certain tuples. A derived horizontal fragmentation (dhf) of a relation Ri with schema RSi(ATTRi, LDi) is a partitioning of RSi produced by applying a partitioning criterion defined on RSj (i ≠ j). (dhf) is correct if there exists a key-based ID of the form Ri[X] ⊆ Rj[Y] and each tuple t ∈ Ri is mapped into exactly one of the resulting horizontal fragments. (dhf) may be used to express access restrictions that span several relations. A view Vi (Vi ∈ V) defined on ℜ represents the area of the database which a corresponding user group can access. Let F (F = Vi ∩ Vj) be a database fragment; then F represents the area of the database to which two groups of users have access in common. If F = Vi \ Vj, F is accessible only to users having view Vi as their interface to the database. In this case, F represents data which are not contained in Vj and, therefore, must not be accessible to the corresponding user set. From the point of view of a mandatory security policy a certain level of assurance must be given that users Vj are restricted from access to F. In AMAC this is produced by separation. For example, fragment (Vi \ Vj) is separated from fragment (Vj \ Vi) and fragment (Vi ∩ Vj) even if all the fragments belong to the same relation. The construction of the fragments makes a structured database decomposition necessary. In addition, to support mandatory access controls, the access windows for the users are constructed in a multilevel fashion in which only the necessary fragments are combined to form a particular view. Let Attr(V) be the attribute set spanned by view V and let the subdomain SD(V[A]) be the domain of attribute A valid in view V (SD(V[A]) ⊆ Dom(A)). Two particular views Vi and Vj are said to be overlapping if

∃A (A ∈ Attr(Vi ∩ Vj) and SD(Vi[A]) ∩ SD(Vj[A]) ≠ ∅).


Otherwise, $V_i$ and $V_j$ are isolated. The process of decomposing $\mathcal{R}$ ($\mathcal{R} = \{RS_1(ATTR_1, LD_1), \ldots, RS_n(ATTR_n, LD_n)\}$) is performed for any two overlapping views and for each isolated view using the (vf), (hf), and (dhf) decomposition operations. It results in a fragmentation schema $FS = \{FS_1(attr_1, ld_1), \ldots, FS_m(attr_m, ld_m)\}$ and a corresponding set of fragments $F$ ($F = \{F_1, \ldots, F_m\}$). If $\bigcup_i ATTR_i = \bigcup_j attr_j$ ($i = 1..n$, $j = 1..m$), the decomposition is called lossless, and if $\bigcup_i LD_i \subseteq \bigcup_j ld_j$ ($i = 1..n$, $j = 1..m$), it is said to be dependency preserving. Note that (hf) or (dhf) may result in additional FDs. A fragmental schema $FS_j \in FS$ is not valid if for any view $V$: $(\exists F_j' \subset F_j)(V \Rightarrow F_j',\ V \nRightarrow F_j)$. Here $V \Rightarrow F$ denotes that users with view $V$ have access to fragment $F$, while $V \nRightarrow F$ means that $F$ is not included in view $V$.

To illustrate these concepts, we now apply the fragmentation policy to the example given in the Introduction. We assume that a requirements analysis has been performed and that the resulting ER model has been translated into the following start schema:


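$$
\begin{aligned}
SS = (\ \mathcal{R} = \{\ & Employee(\{SSN, Name, Dep, Salary\},\ \{SSN \rightarrow Name, Dep, Salary\}),\\
& Project(\{Title, Subject, Client\},\ \{Title \rightarrow Subject, Client\}),\\
& Assignment(\{Title, SSN, Date, Function\},\ \{Title, SSN \rightarrow Date, Function\})\ \},\\
GD = \{\ & Assignment[Title] \subseteq Project[Title],\ Assignment[SSN] \subseteq Employee[SSN]\ \},\\
V = \{\ & V_1, V_2, V_3, V_4, V_5\ \}\ )
\end{aligned}
$$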
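The fragmentation operations defined earlier can be made concrete on this start schema. The following Python sketch is only an illustration under stated assumptions: relations are modeled as lists of dictionaries, all tuples are invented, and the function names hf, dhf, vf, and is_correct are ours rather than part of AMAC. It also assumes that the vertical split of Function is applied to the assignments of employees who fail the predicate Salary ≤ 100.

```python
# Toy instances of Employee and Assignment; all tuples are invented for illustration.
employee = [
    {"SSN": 1, "Name": "Ann", "Dep": "D1", "Salary": 70},
    {"SSN": 2, "Name": "Bob", "Dep": "D1", "Salary": 95},
    {"SSN": 3, "Name": "Cyd", "Dep": "D2", "Salary": 120},
]
assignment = [
    {"Title": "P1", "SSN": 1, "Date": "1993-01", "Function": "tester"},
    {"Title": "P1", "SSN": 3, "Date": "1993-02", "Function": "leader"},
    {"Title": "P2", "SSN": 2, "Date": "1993-03", "Function": "analyst"},
]


def hf(relation, predicate):
    """Horizontal fragmentation: split the tuples by a predicate."""
    matching = [t for t in relation if predicate(t)]
    rest = [t for t in relation if not predicate(t)]
    return matching, rest


def dhf(relation, owner, key, predicate):
    """Derived horizontal fragmentation: split `relation` by a predicate
    evaluated on the `owner` tuple that each tuple references through `key`."""
    owners = {t[key]: t for t in owner}
    # The key-based inclusion dependency must hold, otherwise (dhf) is not correct.
    if any(t[key] not in owners for t in relation):
        raise ValueError("inclusion dependency violated")
    return hf(relation, lambda t: predicate(owners[t[key]]))


def vf(relation, attrs, key):
    """Vertical fragmentation: split off `attrs`; the key is repeated in both
    fragments so that they can be concatenated again."""
    split = [{a: t[a] for a in t if a in key or a in attrs} for t in relation]
    rest = [{a: t[a] for a in t if a in key or a not in attrs} for t in relation]
    return split, rest


def is_correct(relation, fragments):
    """A horizontal fragmentation is correct if every tuple is in exactly one fragment."""
    return all(sum(t in f for f in fragments) == 1 for t in relation)


# (dhf) of Assignment by the predicate Salary <= 100 defined on Employee.
low, high = dhf(assignment, employee, "SSN", lambda e: e["Salary"] <= 100)
# (vf) splitting Function from the remaining attributes of the `high` fragment.
function_part, other_part = vf(high, {"Function"}, {"Title", "SSN"})
```

Appending `low` and `high` yields Assignment again, which mirrors the statement that appending horizontal fragments leads back to the original relation.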

The security policy of the organization requires that the following security conditions be represented:

• View $V_1$ represents the access window for the management of the organization. Users with view $V_1$ should have access to the entire database.

• Views $V_2$ and $V_3$ represent users of the payroll department. Their requirements include access to Employee and Assignment. For $V_2$, access to Employee is not restricted. However, access to the attribute Function should be provided only for employees with Salary $\leq 100$. Users $V_3$ should have access only to employees and their assignments if Salary $\leq 80$.

• View $V_4$ has access to Project. However, access to the attribute Client should not be supported if the subject of a project is “research.”

• View $V_5$ represents the view of users of the quality-control department. In order for these users to perform their duties, they must have access to all information related to projects with subject “development,” i.e., to project data, assignment data, and data concerning assigned employees.

FIG. 3. Example of AMAC database decomposition. (a) Graphical representation of the view structure. (b) Structural decomposition.

Given these types of security requirements, construction of the fragmentation schema in AMAC is warranted. The security constraints fall into three different categories: simple constraints, which define a vertical subset of a relation schema, and content-based and complex constraints, both of which define horizontal fragments of data. A (simplified) graphical representation of the corresponding view structure is given in Fig. 3(a). The view structure forms the basis of the decomposition. Because view $V_1$ spans the entire database, it does not produce any decomposition. View $V_2$ results in a derived horizontal fragmentation (dhf) of Assignment based on evaluation of the predicate $p$: Salary $\leq 100$ defined on Employee. The decomposition is valid because of the existence of the key-based inclusion dependency between Employee and Assignment. For those tuples matching


the condition in the second step, a vertical fragmentation (vf) is performed which splits attribute Function from the other attributes in the derived fragment. In Fig. 3(b) the outcome of this operation is shown as $IF_{21}$ and $IF_{22}$. Introducing view $V_3$ results in a horizontal fragmentation (hf) of Employee (into $IF_3$ and $IF_4$) and in a (dhf) of $IF_1$. $IF_1$ is split into $IF_{11}$ (assignment data of employees with salary below 80) and $IF_{12}$ (assignment data of employees having a salary between 81 and 99). Again, this fragmentation is valid because of the existence of the key-based ID $Assignment[SSN] \subseteq Employee[SSN]$. Introducing view $V_4$ results in application of (hf) to Project, and a further (vf)-decomposition splits attribute Client from projects having as Subject the value “research.” The result of the operations is given in Fig. 3(b) as the fragments $F_1$ and $F_2$. Introducing view $V_5$ again entails several additional (hf) and (dhf) operations. Starting with Project, (hf) is performed on the remaining intermediate Project fragment, resulting in $F_3$ (holding projects with subject “development”) and $F_4$ (holding all other projects). The next step is (dhf) of Assignment, an operation that is necessary in order to find all assignment data that relate to projects having as subject “development.” Such data may be found in each intermediate fragment derived so far; thus, a total of four different (dhf) operations are necessary. A similar situation occurs with employee data. The final decomposition is given in Fig. 3(b) and consists of 16 different fragments.

In order to support MAC it is necessary to determine the security objects and security subjects and to assign appropriate classifications and clearances. In AMAC (semi-)automated security label assignment is supported and based on the following assumption: A fragment accessed by numerous users cannot contain sensitive information, whereas a fragment that is accessed by only a few users may contain sensitive data.
In AMAC such an assumption leads to assignment of a “low” classification level to the former type of fragment and a “high” classification to the latter type. On the other hand, views that access a large number of fragments or fragments assigned “high” classifications must have “high” clearances. In general, a view requires a clearance which allows appropriate users access to all fragments which the view is defined over. Let $F = \{F_1, \ldots, F_n\}$ be a set of fragments and $V = \{V_1, \ldots, V_m\}$ a set of views defined on the database. Let $a: F \rightarrow P(V)$ be a mapping that assigns to a fragment the set of views having access to this fragment. By $card\ a(F_i \rightarrow P(V))$ we denote the cardinality of the set of views accessing the fragment, i.e., $|a(F_i)|$. $card\ a(F_i \rightarrow P(V))$ determines the level of classification that fragment $F_i$ must be provided with. Let $d: V \rightarrow P(F)$ be a mapping which associates with some view the set of fragments spanned by this view. By $card\ d(V_j \rightarrow P(F))$ we denote the cardinality of the set of fragments which a user with view $V_j$ has access to, i.e., $|d(V_j)|$. By applying


$a(F_i)$ and $d(V_j)$ to the example discussed earlier, we derive the following mappings.

Mappings from fragments to views:

$a(F_2) = \{V_1\}$, $a(F_6) = \{V_1\}$,
$a(F_1) = \{V_1, V_4\}$, $a(F_4) = \{V_1, V_4\}$, $a(F_5) = \{V_1, V_5\}$, $a(F_8) = \{V_1, V_2\}$, $a(F_{10}) = \{V_1, V_2\}$, $a(F_{14}) = \{V_1, V_2\}$,
$a(F_3) = \{V_1, V_4, V_5\}$, $a(F_7) = \{V_1, V_2, V_5\}$, $a(F_9) = \{V_1, V_2, V_5\}$, $a(F_{12}) = \{V_1, V_2, V_3\}$, $a(F_{13}) = \{V_1, V_2, V_5\}$, $a(F_{16}) = \{V_1, V_2, V_3\}$,
$a(F_{11}) = \{V_1, V_2, V_3, V_5\}$, $a(F_{15}) = \{V_1, V_2, V_3, V_5\}$.

Mappings from views to fragments:

$d(V_1) = \{F_1, F_2, F_3, F_4, F_5, F_6, F_7, F_8, F_9, F_{10}, F_{11}, F_{12}, F_{13}, F_{14}, F_{15}, F_{16}\}$,
$d(V_2) = \{F_7, F_8, F_9, F_{10}, F_{11}, F_{12}, F_{13}, F_{14}, F_{15}, F_{16}\}$,
$d(V_3) = \{F_{11}, F_{12}, F_{15}, F_{16}\}$,
$d(V_4) = \{F_1, F_3, F_4\}$,
$d(V_5) = \{F_3, F_5, F_7, F_9, F_{11}, F_{13}, F_{15}\}$.

Let us now order the fragments based on the assumption we have presented. The ordering defines the level of classification that has to be assigned to a fragment. Based on our assumption, we may derive the following dominance relationship between the classifications (assuming, to simplify the discussion, a uniform distribution of users among views):

$\{class(F_2), class(F_6)\} > \{class(F_1), class(F_4), class(F_5), class(F_8), class(F_{10}), class(F_{14})\} > \{class(F_3), class(F_7), class(F_9), class(F_{12}), class(F_{13}), class(F_{16})\} > \{class(F_{11}), class(F_{15})\}$.

Furthermore, $clear(V_1) \geq \{class(F_1), \ldots, class(F_{16})\}$, $clear(V_2) \geq \{class(F_7), \ldots, class(F_{16})\}$, $clear(V_3) \geq \{class(F_{11}), class(F_{12}), class(F_{15}), class(F_{16})\}$, $clear(V_4) \geq \{class(F_1), class(F_3), class(F_4)\}$, and $clear(V_5) \geq \{class(F_3), class(F_5), class(F_7), class(F_9), class(F_{11}), class(F_{13}), class(F_{15})\}$. The security classifications are assigned based on the ordering of the fragments and are given in Fig. 4. The dominance relationship $d > c > b > a$ holds for the levels.

FIG. 4. Example of assigned classifications.

Structured decomposition results in the assignment of a classification label to each fragmental schema and a clearance to each user view. Thus, a fragmental schema can be denoted $FS(attr, ld, c)$, which may be understood to mean that data contained in $FS$ is uniformly classified by classification $c$. The process of structured decomposition and label assignment can be automated. The assigned security labels serve only as a suggestion to a human database designer, who can refine them as necessary. However, it is commonly agreed that if the number of different views is large, automated labeling will produce very satisfactory results. The outcome of the decomposition and the assigned classifications and clearances are maintained by three catalog relations: the Dominance Schema, the Data Schema, and the Decomposition Schema.

Applied to the sample label assignment, this means that, based on the AMAC assumptions, fragments $F_6$ and $F_2$ describe the most sensitive area of the database. This seems a legitimate result, since $F_6$ holds the attribute Function of assignments of employees that earn more than 100 if the assignment refers to a project with attribute Subject $\neq$ “development,” and $F_2$ contains sensitive information stating the clients of projects having as subject “research.” Since only one group of users ($V_1$) has access to both fragments, $F_2$ and $F_6$ are assigned a classification that dominates all other classifications of the database and is dominated solely by $clear(V_1)$. On the other hand, fragments $F_{11}$ and $F_{15}$ are accessed by most of the views. The AMAC labeling assumption seems legitimate here, too, because both fragments describe nonsensitive data concerning employees and their assignments if the employee earns less than 80 and if the corresponding project has as subject “development.”

In AMAC multilevel relations exist solely at a conceptual level. The access window for the users is constructed in a multilevel fashion so that only necessary fragments are combined to form a particular view. This is done in a way that is entirely transparent to the user, first, by filtering out fragments that dominate a particular user’s clearance and, second, by performing the inverse decomposition operation on the remaining fragments. For (vf) the inverse is a concatenation of vertical fragments (denoted $\langle c \rangle$) and, for (hf) and (dhf), an append of horizontal fragments (denoted $\langle a \rangle$). In the example we are considering, views $V_1$ and $V_2$ on Employee and Assignment can be constructed in the following way ($\langle * \rangle$ denotes the join operation):

$$V_1 \leftarrow (F_{15} \langle a \rangle F_{16}) \langle a \rangle (F_{13} \langle a \rangle F_{14})\ \langle * \rangle\ ((F_5 \langle a \rangle F_6) \langle c \rangle (F_7 \langle a \rangle F_8)) \langle a \rangle (F_9 \langle a \rangle F_{10}) \langle a \rangle (F_{11} \langle a \rangle F_{12})$$

$$V_2 \leftarrow (F_{15} \langle a \rangle F_{16}) \langle a \rangle (F_{13} \langle a \rangle F_{14})\ \langle * \rangle\ (F_7 \langle a \rangle F_8) \langle a \rangle (F_9 \langle a \rangle F_{10}) \langle a \rangle (F_{11} \langle a \rangle F_{12})$$
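The labeling assumption lends itself to a short computation. The following Python sketch is a minimal illustration, assuming the view-to-fragment mapping $d$ of the example and assuming one classification level per distinct cardinality of $a(F_i)$; the function names invert and assign_levels are ours, and the sketch ignores the refinement by a human designer that AMAC foresees.

```python
# d maps each view to the numbers of the fragments it spans (taken from the example).
d = {
    "V1": set(range(1, 17)),
    "V2": set(range(7, 17)),
    "V3": {11, 12, 15, 16},
    "V4": {1, 3, 4},
    "V5": {3, 5, 7, 9, 11, 13, 15},
}


def invert(d):
    """Derive a (fragment -> set of views) from d (view -> set of fragments)."""
    a = {}
    for view, fragments in d.items():
        for f in fragments:
            a.setdefault(f, set()).add(view)
    return a


def assign_levels(a, levels="dcba"):
    """Fragments accessed by the fewest views receive the highest level.

    Assumption of this sketch: there are at most len(levels) distinct
    cardinalities, each of which is mapped to one level (d > c > b > a).
    """
    cards = sorted({len(views) for views in a.values()})
    if len(cards) > len(levels):
        raise ValueError("more distinct cardinalities than levels")
    level_of = dict(zip(cards, levels))
    return {f: level_of[len(views)] for f, views in a.items()}


a = invert(d)
labels = assign_levels(a)
most_sensitive = sorted(f for f, level in labels.items() if level == "d")
```

With the mappings of the example, `most_sensitive` contains exactly the fragments $F_2$ and $F_6$, in line with the dominance relationship derived above.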


The conceptual multilevel relations look different to different users, depending on the view. For example, the relation Assignment consists of $\{F_5, \ldots, F_{12}\}$ for users $V_1$, of $\{F_7, \ldots, F_{12}\}$ for users $V_2$, and of $\{F_{11}, F_{12}\}$ only for users $V_3$. Three catalog relations are necessary in AMAC in order to maintain the database decomposition, construct the multilevel relations, and control the integrity of the database: the Decomposition Schema, the Data Schema, and the Dominance Schema. Figure 5 presents some of the catalog relations that result from decomposition of the sample database.

FIG. 5. AMAC catalog relations. (a) Dominance Schema (columns View, Clear, Dominates). (b) Data Schema (columns include Attribute and Integrity Constraint). (c) Decomposition Schema.

• Decomposition Schema. The schema comprises a mapping of the decomposition structure into a flat table. Its contents are needed to reconstruct multilevel relations from single-level fragments.

• Dominance Schema. The schema is used to model the allocation from fragments to users. Whenever a user-supplied query attempts to access a multilevel relation, the system has to make certain that the access request does not violate the security policy of the organization. For example, if there is a rule which states that the user’s clearance must dominate the classification of the referenced data, this rule may be implemented using information from the Decomposition Schema and the Dominance Schema.

• Data Schema. The Data Schema contains the schema definitions of the fragments and the set of integrity conditions which must be valid for every tuple in the fragment. Update operations performed on tuples in horizontal fragments may lead to transfer of tuples to other horizontal fragments. This occurs if the update changes the value of a selection attribute to a value beyond the domain of this attribute in the fragment. If information from the Data Schema is used, it is always possible to determine the valid domain of the selection attributes in the fragments and to route tuples to the proper fragments in case an update or insert operation is performed.
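The routing of tuples described for the Data Schema can be sketched as follows. This is only an illustration under assumptions: the fragment names E_low, E_mid, and E_high, the subdomains of the selection attribute Salary, and the functions route, insert, and update are hypothetical and are not taken from the AMAC catalog.

```python
# Hypothetical excerpt of a Data Schema: for each horizontal fragment of Employee,
# the valid subdomain of the selection attribute Salary.
DATA_SCHEMA = {
    "E_low": lambda salary: salary <= 80,
    "E_mid": lambda salary: 80 < salary <= 100,
    "E_high": lambda salary: salary > 100,
}


def route(tup):
    """Return the single fragment whose subdomain contains the tuple's Salary."""
    matches = [name for name, valid in DATA_SCHEMA.items() if valid(tup["Salary"])]
    if len(matches) != 1:
        raise ValueError("fragmentation is not correct for this tuple")
    return matches[0]


def insert(fragments, tup):
    """Insert a tuple into the fragment determined by the Data Schema."""
    fragments[route(tup)].append(tup)


def update(fragments, ssn, new_salary):
    """Update Salary and transfer the tuple if it leaves its fragment's subdomain.

    Returns the names of the source and the target fragment.
    """
    for name, tuples in fragments.items():
        for tup in tuples:
            if tup["SSN"] == ssn:
                tuples.remove(tup)
                moved = dict(tup, Salary=new_salary)
                insert(fragments, moved)
                return name, route(moved)
    raise KeyError(ssn)


fragments = {name: [] for name in DATA_SCHEMA}
insert(fragments, {"SSN": 1, "Name": "Ann", "Salary": 70})
source, target = update(fragments, 1, 90)
```

In this run the raise from 70 to 90 moves the tuple from the hypothetical fragment E_low to E_mid, which is the kind of transfer between horizontal fragments that the Data Schema is meant to support.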

So far we have shown how the security requirements can be expressed in AMAC during database design by means of structured decomposition. In Pernul (1992a) it is shown how these requirements can be enforced during database operation by means of database triggers. Triggers are implemented in most DBMS products and can be used to perform hidden actions without the user’s knowledge. Generally speaking, a trigger consists of two parts. The first part, the trigger definition, specifies when the trigger should be invoked, while the second part, the trigger action, defines the actions which the trigger is to perform. We see triggers as an alternative way of implementing a security policy. In the following discussion we will specify the simple security property (read access) of BLP by means of a select trigger. Similar triggers have been developed by Pernul (1992a) for the insert statement (write access) and have been outlined for the update and delete statements. In what follows, assume that a user having clearance $C$ has logged on to the system. Based on the information of the Dominance Schema, a set of security classifications $\{c_1, \ldots, c_n\}$ with $C \geq \{c_1, \ldots, c_n\}$ may be derived. Any process operating on


behalf of the user that attempts to access any fragment with schema $FS(attr, ld, c')$ and $c' \notin \{c_1, \ldots, c_n\}$ will not be properly authorized and, thus, the corresponding fragments will not be affected by operations performed by the $C$-level user. For security reasons, database fragmentation must be completely transparent to the users, and users must be able to refer to base relations by name even if they are authorized to access only a subset of a multilevel relation. Read access is performed by means of a Select statement which has the following form:

    SELECT attribute list
    FROM base relations
    WHERE p

Every query contains as parameters the user’s identification and the set of referenced base relations. Every multilevel base relation has assigned to it triggers which are executed when the base relation is affected by a corresponding operation. As an example consider the definition of a Select trigger as specified below. Here, %X denotes a parameter, and the keyword DOWN-TO represents the transitive closure of the base relation (i.e., the set of fragments resulting from a base relation). The trigger implements the simple security property of BLP.

    CREATE TRIGGER Select-Trigger ON each-base-relation FOR SELECT AS
    BEGIN
        declare @dominates, @classification
        SELECT @dominates = SELECT Dominates
            FROM Dominance Schema
            WHERE View = %V
        SELECT @classification = SELECT Class
            FROM Decomposition Schema
            WHERE Parent = %specified-base-relation DOWN-TO each-resulting-fragment
        IF @dominates ∩ @classification ≠ ∅ THEN
            perform query for each element IN (@dominates ∩ @classification)
        ELSE
            Print 'Base relation not known to the system'
            Rollback Transaction
    END Select-Trigger

As an example consider a user with view $V_3$ who wishes to know the names of all employees and their functions in assigned projects. Note that users with view $V_3$ should be prevented from accessing data concerning employees who earn more than 80. The user issues the following query:

    SELECT Name, Function
    FROM Employee, Assignment
    WHERE Employee.SSN = Assignment.SSN


Applied to this example, the clearance assigned to users with view $V_3$ yields @dominates $= \{a_1, a_2, b_4, b_6\}$, @classification $= \{d_2, c_3, c_4, c_5, c_6, b_2, b_3, b_4, b_5, b_6, a_1, a_2\}$, and @dominates $\cap$ @classification $= \{a_1, a_2, b_4, b_6\}$. Thus, the query is automatically routed to the corresponding fragments $F_{11}$, $F_{12}$, $F_{15}$, and $F_{16}$ and, based on the information of the Decomposition Schema, $V_3$ can be constructed by means of the inverse decomposition operation, i.e., $V_3 \leftarrow (F_{15} \langle a \rangle F_{16}) \langle * \rangle (F_{11} \langle a \rangle F_{12})$. The outcome of the Select operation is in accordance with the simple security property of BLP.
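The effect of the Select trigger on this query can be sketched outside a DBMS. The following Python sketch is an illustration only: the dictionaries DOMINANCE and DECOMPOSITION stand in for the catalog relations, the pairing of individual labels with individual fragments is an assumption (the text fixes only that the labels $a_1, a_2, b_4, b_6$ correspond to $F_{11}, F_{12}, F_{15}, F_{16}$), and select_trigger is a hypothetical name.

```python
# Stand-in for the Dominance Schema: view -> classifications dominated by its clearance.
DOMINANCE = {"V3": {"a1", "a2", "b4", "b6"}}

# Stand-in for the Decomposition Schema: base relation -> {classification: fragment}.
# The pairing of labels with fragments is assumed for illustration.
DECOMPOSITION = {
    "Employee": {"b5": "F13", "c6": "F14", "a2": "F15", "b6": "F16"},
    "Assignment": {
        "c3": "F5", "d2": "F6", "b2": "F7", "c4": "F8",
        "b3": "F9", "c5": "F10", "a1": "F11", "b4": "F12",
    },
}


def select_trigger(view, base_relations):
    """Return the fragments a query may be routed to (simple security property)."""
    dominates = DOMINANCE.get(view, set())
    classification = {}
    for relation in base_relations:
        classification.update(DECOMPOSITION.get(relation, {}))
    allowed = dominates & set(classification)
    if not allowed:
        # Mirrors the ELSE branch of the trigger: reveal nothing about the relation.
        raise PermissionError("Base relation not known to the system")
    return sorted(classification[c] for c in allowed)


fragments_for_v3 = select_trigger("V3", ["Employee", "Assignment"])
```

Raising the same error for unknown and unauthorized relations follows the trigger text, which reports the base relation as unknown rather than as forbidden.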

2.4 The Personal Knowledge Approach

The personal knowledge approach is focused on protecting the privacy of individuals by restricting access to personal information stored in a database or information system. The model serves as the underlying security paradigm of the prototype DBMS Doris (Biskup and Brüggemann, 1991). The main goal of this security technique is to ensure the right of individuals to informational self-determination, which is now part of the laws of many countries. In this context, the notion of privacy can be summarized as asserting the basic right of an individual to choose which elements of his or her private life may be disclosed. In the model, all individuals, users as well as security objects, are represented by an encapsulated person-object (in the sense of object-oriented technology). The data part of a person-object corresponds to the individual's knowledge of himself or herself and his or her relationship to other persons. The operation part of a person-object corresponds to the possible actions which an individual may perform. The approach is built on the assumption that a person represented in a database has complete knowledge of himself or herself and that if he or she wishes to know something about someone else represented in the database, that person must first be asked. Knowledge of different persons cannot be stored permanently and, therefore, must be requested from the person each time information is requested. In an effort to achieve this lofty goal, the personal knowledge approach developed by Biskup and Brüggemann (1988, 1989) combines techniques of relational databases, object-oriented programming, and capability-based operating systems. More technically, it is based on the following constructs:

Persons. A person-object represents either information concerning an individual about whom data is stored in the information system or represents the actual users of the system. Each person is an instance of a class, called group. Groups form a hierarchy and, in accordance with object-oriented concepts, a member of a group has the components of the group


as well as inherited components from all its supergroups. More technically, an individual person-object is represented by an NF²-tuple (non-first-normal-form, i.e., it may have nonatomic attribute values) with entries of the following form:

    t⟨Surrogate, Knows, Acquainted, Alive, Available, Remembers⟩

where

• Surrogate is a unique identifier which is secretly created by the system
• Knows is application dependent and organized as a relation with a set of attributes $\{A_1, \ldots, A_n\}$; it represents the personal knowledge of the person-object
• Acquainted is a set of surrogates representing other person-objects of which the person is aware
• Alive is a boolean value
• Available contains the set of rights which the person has made available to others
• Remembers contains a set of records describing messages which have been sent or received

Each person is represented as an instance of a group. All persons in a group have the same attributes, operations, roles, and authorities. The operation part of an object consists of system-defined operations which are assigned to groups. Examples of common system-defined operations are ‘create’ (creates a new instance of a group); ‘tell’ (returns the value of the attribute Knows); ‘insert’, ‘delete’, and ‘modify’ (transform Knows); ‘acquainted’ (returns the value for Acquainted), and others.

Communication between Acquainted Objects. Persons are acquainted with other persons. A person individually receives his or her acquaintances by using the operation ‘grant’. The set of acquaintances of a person describes the environment of this person and denotes the set of objects which the person is allowed to communicate with. Communication is performed by means of messages that may be sent from a person to his or her acquaintances in order to query their personal knowledge or to ask that an operation be performed, for example, that this knowledge be updated.

Roles and Authorities. Depending on the authority of the sender, the receiver of a message may react in different ways. The authority of a person with respect to an acquaintance is based on the role which the person is currently performing. While the set of acquaintances of a person may change dynamically, authorities and roles are statically declared in the system. When a person-object is created as an instance of a group, it receives the authorities declared in this group and in all its supergroups.


Auditing. Each person remembers the messages the person is sending or receiving. This is established by adding all information about recent queries and updates, together with the authorities available at that time, to the ‘knowledge’ (attribute Remembers) of the sender and receiver person-object. Based on this information, auditing can be performed and all transactions traced by just ‘asking’ the affected person.

Security (privacy) enforcement following the personal-knowledge approach is based on two independent features. First, following login each user is assigned as an instance of a person-object type and, thus, assumes individually received acquaintances and statically assigned authorities as roles. Second, whenever a user executes a query or an update operation, the corresponding transaction is automatically modified in such a way that resulting messages are sent only to the acquaintances of the person. Summarizing, the personal-knowledge approach is fine-tuned to meet the requirements of informational self-determination. Thus, it is the preferable approach as the underlying security paradigm for database applications which maintain information about individuals that is not available to the public, for example, hospital information systems or databases containing census data.
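The interplay of person-objects, acquaintances, and auditing can be illustrated with a small sketch. The Python code below is our own simplification and not the Doris implementation: the class name Person, the method names grant, ask, and tell, and the format of the audit records are assumptions, and the components Alive and Available as well as roles and authorities are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """Simplified person-object with a subset of the tuple components."""
    surrogate: int
    knows: dict = field(default_factory=dict)      # personal knowledge
    acquainted: set = field(default_factory=set)   # surrogates of known persons
    remembers: list = field(default_factory=list)  # audit trail of messages

    def grant(self, other):
        """Receive `other` as an acquaintance (assumed reading of 'grant')."""
        self.acquainted.add(other.surrogate)

    def tell(self, attribute):
        """Return the requested part of the personal knowledge."""
        return self.knows.get(attribute)

    def ask(self, other, attribute):
        """Send a 'tell' message; messages reach acquaintances only."""
        if other.surrogate not in self.acquainted:
            raise PermissionError("receiver is not an acquaintance")
        answer = other.tell(attribute)
        record = ("tell", self.surrogate, other.surrogate, attribute)
        # Both sender and receiver remember the message for later auditing.
        self.remembers.append(record)
        other.remembers.append(record)
        return answer


patient = Person(1, knows={"diagnosis": "flu"})
physician = Person(2)
physician.grant(patient)
answer = physician.ask(patient, "diagnosis")
```

Because both parties store the same record, an audit in this sketch amounts to reading the Remembers component of the affected person, as described in the text.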

2.5 The Clark and Wilson Model

This model was first summarized and compared to MAC by Clark and Wilson (1987), who claimed that their model was based on concepts already well established in the pencil-and-paper office world. These concepts include the notion of security subjects, (constrained) security objects, a set of well-formed transactions, and the principle of separation of duty. If we transfer these principles to the database and security world, they assume the following interpretation: Users of a system are restricted to executing solely a certain set of transactions permitted to them, and each transaction operates solely on an assigned set of data objects. More precisely, the Clark and Wilson approach may be interpreted in the following way:

1. Security subjects are assigned to roles. Based on the role which they play in an organization, users have to perform certain functions. Each business role is mapped into database functions and, ideally, at a given time, a particular user is playing only one role. A database function corresponds to a set of (well-formed) transactions that are necessary for users acting in a particular role. In this model it is essential to state which user is acting in which role at what time and, for each role, what transactions have to be carried out. To control against unauthorized disclosure


and modification of data, Clark and Wilson proposed that access be permitted only through execution of certain programs and well-formed transactions, and that the rights of users to execute such code be restricted based on the particular role a user is acting in.

2. Well-formed transactions. A well-formed transaction operates on an assigned set of data. It is necessary to ensure that all the relevant security and integrity properties are satisfied. In addition, a well-formed transaction must provide logging and atomicity as well as serializability of the resulting subtransactions in such a way as to enable the construction of concurrency and recovery mechanisms. It is important to note that, in this model, data items referenced by transactions are not specified by the user implementing the transaction. Rather, data items are assigned depending on the role which the user is enacting. Thus, the model does not allow ad hoc database queries.

3. Separation of duty. This principle requires assigning to each set of users a specific set of responsibilities based on the role the user enacts in the organization. The only way for a user to access data in the database is through an assigned set of well-formed transactions specific to the role which the particular user enacts. In those cases in which a user requires additional information, another user (cleared at a higher level) acting in a separate role must implement a well-formed transaction from the transaction domain of the role he is enacting in order to grant the initial user temporary permission to execute a larger set of well-formed transactions. Moreover, the roles have to be defined in such a way as to make it impossible for a single user to violate the integrity of the system. For example, the design, implementation, and maintenance of a well-formed transaction must be assigned to a different role than execution of the transaction.
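The three points above can be condensed into a small access check. The following Python sketch is an illustration under assumptions: the role names, transaction names, data items, and the functions may_execute and violates_separation_of_duty are invented for this example and are not part of the Clark and Wilson model itself.

```python
# Hypothetical role definitions: role -> well-formed transactions it may execute.
ROLE_TRANSACTIONS = {
    "clerk": {"enter_order"},
    "developer": {"install_transaction"},
}

# Hypothetical binding of transactions to their assigned data items.
TRANSACTION_DATA = {
    "enter_order": {"orders"},
    "install_transaction": {"transaction_code"},
}

# Pairs of roles that must not be held by the same user (separation of duty).
CONFLICTING_ROLES = [{"clerk", "developer"}]


def may_execute(active_role, transaction, data_items):
    """A user acts in one role and may run only its transactions on assigned data."""
    allowed = transaction in ROLE_TRANSACTIONS.get(active_role, set())
    assigned = set(data_items) <= TRANSACTION_DATA.get(transaction, set())
    return allowed and assigned


def violates_separation_of_duty(user_roles):
    """True if a single user holds two roles that are declared as conflicting."""
    return any(pair <= set(user_roles) for pair in CONFLICTING_ROLES)


ok = may_execute("clerk", "enter_order", ["orders"])
ad_hoc = may_execute("clerk", "enter_order", ["salaries"])
```

The second call is rejected because the data item is not assigned to the transaction, which reflects the observation that the model does not allow ad hoc access to data.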
A first attempt to implement the concept of a well-formed transaction was that of Thomsen and Haigh (1990). The authors compared the effectiveness of two mechanisms for implementing well-formed transactions, Lock type enforcement (see Subsection 3.2) and the Unix setuid mechanism. With type enforcement, accesses of user processes to data can be restricted based on the domain of the process and the type of data. The setuid and setgid features allow a user who is not the owner of a file to execute commands in the file with the owner’s permission. Although Thomsen and Haigh concluded that both mechanisms are suitable for implementing the Clark and Wilson concept of a well-formed transaction, no further studies or implementation projects are known. The Clark and Wilson model has drawn considerable interest in recent years. However, though it seems quite promising at first glance, it is still


lacking, we believe, detailed and thorough investigation. In particular, the only potential threats to the security of a system which were addressed were penetration of data by authorized users, unauthorized actions by authorized users, and the abuse of privileges by authorized users. As noted earlier in our discussion, this represents only a subset of the required functionality of the mandatory security features of a DBMS.

2.6 A Final Note on Database Security Models

In this section we have discussed different approaches towards the representation of database security. In concluding the section, we wish to note that although the models differ significantly, all of the approaches which we have discussed have their own raison d’être.

The discretionary security approach may be the first choice if a high degree of security is not necessary. Keeping the responsibility to enforce security on the user’s side is sufficient only if potential threats against security would not result in great damage. Even if a central authority is responsible for granting and revoking authorizations, DAC-based protection may still be subject to Trojan horse attacks and cannot be recommended as a security technique in security-critical database applications.

Mandatory policies are more effective, as users have no control over the creation and alteration of security parameters. In addition, a security policy suitable to a particular application may have both a mandatory and a discretionary component. Note, too, that real systems often allow for leaks in strict mandatory controls, for example, to privileged users, such as system administrators and security officers. Such back-door entry points often represent a serious source of vulnerability. Multilevel applications may become very complex. One way of countering this complexity would be to develop a conceptual representation of a multilevel database application. We will come back to this issue in Section 4, where a conceptual model for multilevel database security is introduced.

Although very effective, mandatory policies can only be applied in environments where labeled information is available. We believe this is one of the strongest points in favor of the AMAC security model. AMAC offers a design environment for databases with principal emphasis on security. It includes discretionary as well as mandatory controls. However, the model suffers from a limited level of expressiveness.
AMAC uses relational algebra to express security constraints which, for certain applications, may not be sufficiently expressive to specify sophisticated security constraints. We interpret the personal knowledge approach as a means of implementing discretionary controls. Permitting person-objects to decide whether to

38

GUNTHER PERNUL

respond to a query issued by another object seems to be a very effective way of maintaining the privacy of stored information. Privacy security may be an interesting alternative in applications where mainly personal information is maintained, for example in hospital information systems. The Clark and Wilson model has gained wide acceptance in recent years. Although at first glance it seems promising it is our belief that there is still a need for a detailed and thorough investigation because a number of major questions remain open. Many security-relevant actions are relegated to application programs; moreover, the model does not support ad hoc database queries. While we believe that most of the database security requirements could be expressed, this, however, would entail tremendous application development costs.

3. Multilevel Secure Prototypes and Systems

Trusted systems are systems for which convincing arguments or proofs have been given to the effect that the security mechanisms are working as prescribed and cannot be subverted. A basic property of trusted systems is their size; these systems tend to be quite large in terms of the amount of code needed for their implementation. This is especially true of complex systems, for example, trusted database management systems. A complete formal implementation proof of system specifications is still not possible using present-day technology, although a great deal of research on formal specification and verification is currently in progress. The enormous amount of code necessary is the reason for the very conservative approach taken by most trusted DBMSs in an effort to achieve a certain level of assurance through reuse and by building upon previously built and verified trusted systems, an approach known as TCB subsetting. A trusted computing base (TCB) refers to that part of a system which is responsible for enforcing a security policy; it may involve any combination of hardware, firmware, and operating system software. The term was defined in the Trusted Computer System Evaluation Criteria (TCSEC, 1985). The criteria define seven levels of trust, which range from systems that have minimal protection features to those that provide the highest level of security which state-of-the-art security techniques may produce. TCSEC is not the only proposal put forward for the purpose of defining objective guidelines upon which security evaluations of systems may be based. We will review TCSEC and other proposals in Section 5. TCB subsetting has been identified as a strategy for building trusted DBMSs in the Trusted Database Interpretation (TDI, 1990) of TCSEC. In this section we will discuss the most prominent projects which have had as

their goal the design of systems that meet the requirements of the higher levels of trust as specified in the TDI evaluation criteria. In order to obtain evaluation at higher levels of trust, a system must be supported by mandatory access controls. There have been three main efforts at designing and implementing trusted relational database systems: SeaView, which has been implemented at SRI; LDV at the Honeywell SCTC; and ASD at TRW. Besides these (semi-)academic prototypes, several vendors, including Ingres, Informix, Oracle, Sybase, Trudata, and others, have announced or already released commercial systems that support mandatory access controls. The systems differ not only in details; there is not even agreement as to what the granularity of the security object should be. For example, SeaView supports labeling at an individual attribute value level, LDV supports tuple-level labeling, and in ASD-Views the security object is a materialized view. Some commercial systems, moreover, support security labeling exclusively at the relation level or even the database level.

3.1 SeaView

The most ambitious and exciting proposal aimed at the development of a trusted DBMS has come from the SeaView project (see Denning et al., 1987, or Lunt, 1990). The project was begun in 1987 and is a joint effort by Stanford Research Institute (SRI) International, Oracle, and Gemini Computers with the goal of designing and prototyping a multilevel secure relational DBMS. The most significant contribution of SeaView lies in the realization that multilevel relations must exist solely at a logical level and, moreover, may be decomposed into single-level base relations. These findings are mainly of practical importance. In particular, single-level base relations can be stored using a conventional DBMS, while commercially available TCBs can be used to enforce mandatory controls with respect to single-level fragments. The architectural approach taken by the SeaView project was to implement the entire DBMS on top of the commercially available Gemsos TCB (Schell et al., 1985). Gemsos provides user identification and authentication, maintenance of tables containing clearances, as well as a trusted interface for privileged security administrators. Multilevel relations are implemented as views over single-level relations. The single-level relations are transparent to the users and stored by means of the storage manager of an Oracle DBMS engine. From the viewpoint of Gemsos, every single-level relation is a Gemsos security object belonging to a certain access class. Gemsos enforces the mandatory security policy based on the Bell-LaPadula security paradigm. A label comparison is performed whenever a subject

attempts to bring a storage object into its address space. A subject is prevented from accessing storage objects not in the subject’s current address space by means of hardware controls that are included in Gemsos. In addition to mandatory controls, the SeaView security policy requires that no user be given access to information unless that user has been granted discretionary authorization to this information. DAC-based protection is performed outside Gemsos and allows users to specify which users and groups have authorization to specific modes of access to particular database objects, as well as which users and groups are explicitly denied authorization to particular database objects. Since a multilevel relation is stored as a set of single-level fragments, two algorithms are necessary:

1. A decomposition algorithm to break down multilevel relations into single-level fragments.
2. A recovery formula to reconstruct an original multilevel relation from fragments.

Clearly, recovery must reproduce the original relation exactly; otherwise the process of decomposition and recovery is incorrect. In SeaView, decomposition of multilevel relations into single-level relations is performed by means of vertical and horizontal fragmentation, while recovery is performed by union and join operations. In what follows, consider a conceptual multilevel relation $R(A_1, C_1, \ldots, A_n, C_n, TC)$, where each $A_i$ is an attribute defined over a domain $D_i$ and each $C_i$ is a security class from a list (TS, S, Co, U), where TS > S > Co > U. We assume $A_1$ is the apparent primary key. The original SeaView decomposition algorithm (Denning et al., 1988) consists of three steps and can be outlined as follows:

Step 1. The multilevel relation $R$ is vertically partitioned into $n$ projections $R_1[A_1, C_1]$, $R_2[A_1, C_1, A_2, C_2]$, ..., $R_n[A_1, C_1, A_n, C_n]$.

Step 2. Each $R_i$ is horizontally fragmented into a single resulting relation for each security level. Obviously, for (TS, S, Co, U) this results in $4n$ relations.

Step 3. In a further horizontal fragmentation, $R_2, \ldots, R_n$ (i.e., $4n - 4$ relations) are each further decomposed into at most four resulting relations. The final decomposition is necessary in order to support polyinstantiation.

For this algorithm a performance study and worst-case analysis was performed by Jajodia and Mukkamala (1991), which demonstrated that a multilevel relation $R(A_1, C_1, \ldots, A_n, C_n, TC)$ decomposes into a maximum of $10n - 6$ single-level relations.
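As a rough illustration of Steps 1 and 2, the following Python sketch projects a small multilevel relation vertically and then fragments each projection horizontally by level. It is not the published SeaView algorithm: the dictionary-based row layout, the function name `decompose_steps_1_and_2`, and the choice to fragment each $R_i$ on the class $C_i$ of its own attribute are assumptions made for illustration, and Step 3 (the fragmentation needed for polyinstantiation) is omitted.

```python
# Security classes in ascending order: U < Co < S < TS.
LEVELS = ["U", "Co", "S", "TS"]


def decompose_steps_1_and_2(rows, n):
    """Sketch of Steps 1 and 2 for a relation R(A1, C1, ..., An, Cn, TC).

    rows: list of dicts with keys "A1", "C1", ..., f"A{n}", f"C{n}", "TC".
    Returns a dict mapping (i, level) to one single-level fragment.
    """
    fragments = {}
    for i in range(1, n + 1):
        # Step 1: vertical projection R_i[A1, C1, Ai, Ci]; R_1 is just [A1, C1].
        columns = ["A1", "C1"] if i == 1 else ["A1", "C1", f"A{i}", f"C{i}"]
        projection = [{c: row[c] for c in columns} for row in rows]
        # Step 2: one horizontal fragment per security level.
        # Assumption: R_i is fragmented on C_i, the class of its own attribute.
        for level in LEVELS:
            fragments[(i, level)] = [t for t in projection if t[f"C{i}"] == level]
    return fragments


rows = [
    {"A1": "e1", "C1": "U", "A2": 50, "C2": "S", "TC": "S"},
    {"A1": "e2", "C1": "Co", "A2": 70, "C2": "Co", "TC": "Co"},
]
fragments = decompose_steps_1_and_2(rows, n=2)
print(len(fragments))  # 4 * n fragments, some of them empty
```

With four levels the sketch always yields $4n$ fragments, matching the count given for Step 2; many of them are empty for small relations, which hints at why the number of fragments became a point of criticism.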

The algorithm was subjected to extensive discussion in the scientific literature. Jajodia and Sandhu (1990b) pointed out that it leads to unnecessary single-level fragments. Moreover, performing a recovery of multilevel relations entails repeated joins that may lead to spurious tuples. As an alternative they proposed changing the polyinstantiation integrity property defined in the original SeaView data model by dropping the portion of the property that enforces multivalued dependency. Their suggestions led to a reformulation of polyinstantiation integrity by Lunt et al. (1990). In a further proposal, Jajodia and Sandhu (1991b) presented a second algorithm that decomposes a multilevel relation into single-level fragments together with a new recovery algorithm which reconstructs an original multilevel relation. The recovery algorithm in this proposal improves on earlier versions because decomposition now uses only horizontal fragmentation. Since no vertical fragmentation is required, it is possible to reconstruct a multilevel relation without having to perform costly join operations; only unions have to be processed. Recently, Cuppens and Yazdanian (1992) proposed a “natural” decomposition of multilevel relations based on a study of functional dependencies and an application of normalization whenever a decomposition of multilevel relations is attempted. As decomposition and recovery are crucial for SeaView performance, it is expected that efficient techniques for decomposing multilevel relations into single-level fragments will remain a heavily discussed research topic in the future. A further contribution of SeaView was the development of a multilevel SQL (MSQL) database language (Lunt et al., 1988). MSQL is an extension of SQL (Structured Query Language) and includes user commands for operating on multilevel relations.
The design includes a preprocessor that accepts multilevel queries and translates the queries into single-level standard SQL queries operating on decomposed single-level fragments.
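The benefit of a purely horizontal decomposition, namely that recovery needs only unions, can be illustrated with a deliberately simplified Python sketch. It partitions tuples by their tuple class TC only and is not the algorithm of Jajodia and Sandhu; the row layout and the function names `decompose` and `recover` are assumptions for illustration.

```python
# Security classes in ascending order: U < Co < S < TS.
LEVELS = ["U", "Co", "S", "TS"]


def decompose(rows):
    """Partition tuples horizontally into one fragment per tuple class TC."""
    fragments = {level: [] for level in LEVELS}
    for row in rows:
        fragments[row["TC"]].append(row)
    return fragments


def recover(fragments, clearance):
    """Rebuild the relation visible at a clearance as a union of fragments."""
    result = []
    for level in LEVELS[: LEVELS.index(clearance) + 1]:
        result.extend(fragments[level])  # union only, no join required
    return result


rows = [
    {"Name": "Ann", "Salary": 50, "TC": "U"},
    {"Name": "Bob", "Salary": 90, "TC": "S"},
]
fragments = decompose(rows)
low_view = recover(fragments, "Co")   # only the U tuple is visible
high_view = recover(fragments, "TS")  # both tuples are visible
```

A preprocessor of the kind described for MSQL could, under the same assumption, translate a multilevel query into a union of single-level queries over the fragments dominated by the user's clearance.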

3.2 Lock Data Views

Lock Data Views (LDV) is a multilevel secure relational DBMS, hosted on the Lock TCB and currently prototyped at the Honeywell Secure Computing Technology Center (SCTC) and MITRE. Lock supports a discretionary as well as a mandatory security policy. The mandatory policy enforces the simple security property and the restricted *-property of BLP. The authors of LDV have stated that, because of its operating system orientation, the Lock security policy had to be extended for use in LDV (Stachour and Thuraisingham, 1990). One aspect of Lock, type enforcement, is of special interest for the increased functionality of this TCB in LDV.

The general concept of type enforcement in Lock and its use in LDV has been discussed by Haigh et al. (1990). The main idea is that a subject’s access to an object is restricted by the role he or she is performing in the system. This is done by assigning a domain attribute to each subject and a type attribute to each object, both of which are maintained within the TCB. Entries in the domain definition table correspond to a domain of a subject and to a type list representing the set of access privileges which this subject possesses within the domain. The type enforcement mechanism of Lock made it possible to encapsulate LDV in a protected subsystem by declaring database objects to be special Lock types (Lock files) accessible only to subjects executing in the DBMS domain. Since only DBMS programs are allowed to execute in this domain, only DBMS processes can access Lock types holding portions of the database. The remaining problem that had to be solved was to enable secure release of data from the DBMS domain to the user domain. Fortunately, Lock supports implementation of assured pipelines that have been used in LDV to transfer data between DBMS and user domains. Assurance is achieved through appropriate trusted import and export filters (hardware and software devices).

Two basic extensions to the Lock security policy have been implemented in LDV. Both extensions concern proper classification of data. The first extension relates to insert and update of data. In the course of insert and update, data are assigned to the Lock type which is classified at the lowest level at which the tuple can be stored securely. The second extension is concerned with query results. The result of a query is transferred from Lock types into ordinary objects and the appropriate security level of the query result is derived. The two policies are enforced in LDV by means of three assured pipelines: the query/response pipeline, the data/input pipeline, and the database definition/metadata pipeline.

The query/response pipeline is the query processor of LDV. It consists of a set of processes which execute multi-user retrieval requests, integrate data from different Lock types, and output information at an appropriate security level. A user-supplied query is first mapped from the application domain into the DBMS domain; the query is then processed, and the result is labeled and, finally, exported to the user. To prevent logical inference over time, the response pipeline includes a history function. This mechanism can be used to trace queries already performed for a particular user and to deny access to relations based on the querying history of the user.

The data/input pipeline is responsible for actions that have to be taken whenever a user issues an insert, modify, or delete operation. The request must first be mapped from the application domain to the DBMS domain. The request must then be processed. A delete request will affect only data

at a single classification level (restricted *-property of BLP). For consistency reasons, data are not actually removed but only labeled as deleted. Before the actual removal takes place certain consistency checks are performed. More complicated is the case in which the request involves an insert operation. Classification rules that may be present in the data dictionary (see the discussion of the database definition/metadata pipeline) may make it necessary to decompose a relation tuple into different subtuples, which are then stored in separate files, each with a different classification. A modify request is implemented in a way similar to the insert operation.

The database definition/metadata pipeline interacts with the LDV data dictionary and is used to create, delete, and maintain metadata. Metadata either correspond to definitions of the database structure (relations, views, attributes, domains) or are classification constraints. Classification constraints are rules that are responsible for assigning proper classification levels to data. The use of the metadata pipeline is restricted to the database administrator or database security officer (DBSSO). Here, again, Lock type enforcement mechanisms are used to isolate metadata in files that can be accessed only by the DBMS domain and the DBSSO domain and not by the application domain.

A few final words on the organization of an LDV database. Data are distributed across Lock files and the basic scheme is to assign a single set of files to each security level. The data/input pipeline determines the appropriate assignment of data to files through examination of classification constraints stored in the data dictionary. In LDV there is no replication of data across different security levels. The advantage of this approach lies in the simplicity of updates. However, the approach suffers from the disadvantage of a significant performance penalty for retrieval requests due to the need for a recovery algorithm.
The recovery algorithm used in LDV is outlined by Stachour and Thuraisingham (1990).
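The type enforcement idea described earlier in this subsection can be sketched as a lookup in a domain definition table. The following Python fragment is only an illustration of the principle, not Lock's actual interface: the table contents, the domain and type names, and the function `access_allowed` are all hypothetical.

```python
# Hypothetical domain definition table: domain -> {object type -> access modes}.
DOMAIN_DEFINITION_TABLE = {
    "dbms": {"lock_db_file": {"read", "write"}, "export_buffer": {"write"}},
    "user": {"export_buffer": {"read"}},
}


def access_allowed(subject_domain, object_type, mode):
    """Grant access only if the subject's domain lists the mode for the type."""
    allowed_modes = DOMAIN_DEFINITION_TABLE.get(subject_domain, {}).get(object_type, set())
    return mode in allowed_modes


print(access_allowed("dbms", "lock_db_file", "read"))   # True
print(access_allowed("user", "lock_db_file", "read"))   # False
print(access_allowed("user", "export_buffer", "read"))  # True
```

In this toy table a subject in the user domain can never touch a database file directly; data reach it only through an object type that the DBMS domain may write and the user domain may read, which loosely mirrors the role of an assured pipeline.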

3.3 ASD-Views

ASD-Views, implemented on top of an existing DBMS called ASD, is a research project at TRW. ASD is a multilevel relational system offering classification at the tuple level. In 1988 attempts were begun at TRW to extend ASD and to choose views as the objects of mandatory as well as discretionary security. Wilson (1988) discussed the advantages and disadvantages of views as the target of protection within ASD-Views. Among the advantages he stated the following:

• Views are very flexible and can be used to define access control based on the content of the data.
• The view definition itself documents the criteria used to determine the classification of data.
• Arithmetic and aggregate functions could be used to define views.
• Tuple-level classification can be achieved by specifying horizontal views, while attribute-level classification can be achieved by specifying vertical subsets of relations.
• Access control lists can be associated with views and can control discretionary access. Thus, the same concept could be used for mandatory and discretionary protection.

However, there are also certain major disadvantages in using views for mandatory protection, two of which are as follows:

• The view definitions may need to be considered within the TCB. View-based DBMSs tend to be very large, since views are responsible for most of the code of the DBMS. Since a small TCB is required for successful evaluation of the correctness of the specifications and the code, including maintenance of views within the TCB would represent a tremendous increase in the verification effort.
• Not all data are updateable through certain views.

To overcome the disadvantages, Garvey and Wu (1988) included in a near-term design of ASD-Views the requirement that each view must include a candidate key of the underlying base relation and, moreover, that the near-term design should support only a restricted query language in order to define secure views. ASD-Views was restricted so that, for example, a view definition may describe a subset of data from a single base relation only, while joins, aggregate functions, and arithmetic expressions are not allowed. The authors of ASD-Views argue that these restrictions reduced the TCB code considerably. In ASD-Views the restricted views are the security objects and base tables can only be accessed through views. In ASD-Views the creation of a view must be trusted, since otherwise a Trojan horse in untrusted code could switch the names of two columns, causing data at a higher security level to become visible to a user logged in at a lower level. During database initialization a trusted database administrator creates all the tables and their associated views and assigns a classification level to each view. When a user logs in to ASD-Views, a user process is created at the user’s login clearance, and discretionary and mandatory access checks on the referenced views can be performed. Because ASD-Views is built on top of ASD, the system may operate in all three different modes of operation of ASD (Hinke et al., 1992). In the first mode of operation, the DBMS is a server in a local-area network. In the second mode of operation, the system serves as a back-end DBMS for single-level

or multilevel host computers. In the final mode of operation, the system serves as a host-resident DBMS within a multilevel host running a multilevel secure operating system.
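The near-term restrictions on view definitions described above lend themselves to a simple admissibility check. The following Python sketch assumes a hypothetical, already-parsed view description (a dictionary with the fields shown) and a hypothetical catalog of candidate keys; it is not taken from the ASD-Views implementation.

```python
# Hypothetical catalog of candidate keys per base relation.
CANDIDATE_KEYS = {"Employee": [{"SSN"}]}


def is_admissible_view(view):
    """Check the near-term ASD-Views restrictions described in the text."""
    # A view may describe a subset of data from a single base relation only.
    if len(view["base_relations"]) != 1:
        return False
    # Joins, aggregate functions, and arithmetic expressions are not allowed.
    if view["uses_join"] or view["uses_aggregate"] or view["uses_arithmetic"]:
        return False
    # The view must include a candidate key of the underlying base relation.
    keys = CANDIDATE_KEYS.get(view["base_relations"][0], [])
    return any(key <= set(view["columns"]) for key in keys)


good_view = {
    "base_relations": ["Employee"],
    "columns": ["SSN", "Name"],
    "uses_join": False,
    "uses_aggregate": False,
    "uses_arithmetic": False,
}
bad_view = dict(good_view, columns=["Name", "Salary"])  # candidate key missing

print(is_admissible_view(good_view))  # True
print(is_admissible_view(bad_view))   # False
```

Keeping the check this small is in the spirit of the argument that a restricted view language keeps the trusted code base small.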

4. Conceptual Data Model for Multilevel Security

Designing a database is a complex and time-consuming task, even more so in the case when attention must also be given to the security of the resulting database. Database design, including the design of databases containing sensitive data, is normally done in a process consisting of at least three main design phases (Fugini, 1988). The first phase, conceptual design, produces a high-level, abstract representation of the database application. The second phase, called logical design, translates this representation into specifications that can be implemented using a DBMS. The third phase, or physical design, determines the physical requirements for efficient processing of database operations. Conceptual and logical design can be performed independently of the choice of a particular DBMS, whereas physical design is strongly system dependent. In this section we will develop a conceptual data model for multilevel security. Such a data model is of particular importance to a security administrator who wishes to get a clear understanding of the security semantics of the database application. The model proposed combines well-accepted technology from the field of semantic data modeling with multilevel security. We will start by identifying the basic requirements of a conceptual data model. The following characteristics of a conceptual database model have been discussed in the literature (see Elmasri and Navathe (1989) or Navathe and Pernul (1992)):

• Expressiveness. The data model must be powerful enough to point out common distinctions between different types of data, relationships, and constraints. Moreover, the model must offer a toolset to describe the entire set of application-dependent semantics.
• Simplicity. The model should be simple enough for a typical user or end user to understand and should, therefore, possess a diagrammatic representation.
• Minimality. The model should comprise only a small number of basic concepts. Concepts must not be overlapping in meaning.
• Formality. The concepts of the model should be formally defined and should be correct. Thus, a conceptual schema can be seen as a formal unambiguous abstraction of reality.

Semantic data models address these requirements and provide constructs which represent the semantics of the application domain correctly. In the proposed approach to the construction of a semantic data model for security we use Chen’s Entity-Relationship (ER) model with enhancements needed for multilevel security. The decision to choose ER is motivated by the fact that this model is extensively used in many database design methodologies, possesses an effective graphical representation, and is a de facto standard of most tools which support database design. We will not discuss aspects related to data semantics, though we will describe in detail application-dependent security semantics which have to be considered in a conceptual data model for multilevel security. For details on the ER approach and questions related to conceptual database design the reader is referred to Batini et al. (1992). Compared to the enormous amount of published literature on semantic modeling and the conceptual design of databases, not much work has been done in investigating the security semantics of multilevel secure database applications. Only recently have there been studies aimed at providing tools and assistance to help the designer working on a multilevel database application. The first attempts to use a conceptual model to represent security semantics were those of G. W. Smith (1990a, 1990b). G. W. Smith developed a semantic data model for security (SDMS) based on a conceptual database model and a constraint language. It was a careful and promising first step which has influenced all succeeding approaches. More recent efforts have been made as part of the SPEAR project (Wiseman, 1991 and Sell, 1992). SPEAR is a high-level data model that resembles the ER approach. It consists of an informal description of the application domain and of a mathematical specification which employs a formal specification language.
Two further related projects are known, both of which attempt to include dynamics, in addition to modeling the statics of the application, as part of the conceptual modeling process. In Burns (1992) the ER model was extended to capture limited behavior by including the operations ‘create’, ‘find’, and ‘link’ in the conceptual database representation, whereas in Pernul (1992b) ER was used to model the static part of an MLS application while data-flow diagramming was used to model the behavior of the system. The discussion in the following subsection partly adopts the graphical notation developed in Pernul (1992b). The proposal made in the present section considerably extends previous work on security semantics. In particular,

• it carefully defines the major security semantics that have to be expressed in the design of a multilevel application
• it outlines a security-constraints language (SCL) to express the corresponding rules in a conceptual model of the application
• it provides a graphical notation for constraints expressed in the ER model
• it gives general rules to detect conflicting constraints
• it suggests implementation of the constraint system in a rule-based system so as to achieve completeness and consistency of the security semantics.

4.1 Concepts of Security Semantics

The notion of security semantics embraces all security-relevant knowledge about the application domain. It is concerned mainly with the secrecy and privacy aspect of information (maintaining confidentiality against risk of disclosure) and with the integrity aspect of information (assuring that data is not corrupted). Within the framework of multilevel security, security semantics consists basically of rules (security constraints) classifying both data and query results. The rules are specified by the database designer and must correctly represent the level of sensitivity of classified data. In considering security semantics, certain concepts deserve special attention as regards the classification constraints:

• Identifier. A property which uniquely identifies an object of the real world is called its key or identifier. In security semantics there is also the notion of a near-key, a property that identifies a particular object not uniquely but most of the time. For example, the SSN of an employee is a key while the property Name is a near-key.
• Content. The sensitivity of an object of a certain type is usually dependent on its content, i.e., actual data values or associations of data with metadata serve to classify an object.
• Concealing Existence. In security-critical applications it may be necessary to conceal the very existence of classified data, i.e., it is not sufficient to provide unauthorized users with null values of certain facts.
• Attribute-Attribute Value. Most data make sense only when combined with metadata. As a result, in referring to a classified property, it is understood that both the property and its value are classified.
• Nonconflicting Constraint Set. For large applications it may be necessary to express a large set of security constraints at the conceptual database level. Verifying the consistency of specified constraints is one of the more difficult tasks. In the approach we have proposed there is a distinction between two types of conflicts. Depending on the type, a conflict may be resolved automatically, or the designer may be notified and then decides on a suitable resolution strategy.
• Default Security Level. A set of classification constraints is complete if every piece of data has assigned to it a classification level via the classification constraints. In our approach completeness is enforced by ensuring that every piece of data has a default classification. The security level public cannot be assigned explicitly and instead is used as an initial classification in order to ensure completeness. If there are no further classification rules applicable for certain data, public has the semantic meaning that the data are not classified at all.

In the following discussion we present a taxonomy of security semantics consisting of the most common application-dependent requirements on multilevel security. Each requirement is formally defined, expressed in a security-constraint language (SCL), included explicitly in the notation of the ER model, and explicated by means of an example. We start with the basic concepts. An object type $O$ is a semantic real-world concept that is described by certain properties. Using ER terminology, $O$ might be an entity type, a specialization type, a generic object, or a relationship type. In security terminology, $O$ is the target of protection and might be denoted $O(A_1, \ldots, A_n)$. $A_i$ ($i = 1..n$) is a characteristic property defined over a domain $D_i$. Each security object must possess an identifying property $K$ ($K \subseteq \{A_1, \ldots, A_n\}$) which distinguishes instances (occurrences) $o$ of $O$ ($o = \{a_1, \ldots, a_n\}$, $a_i \in D_i$) from others. Moving to a multilevel world, the major question now is how to assign the properties and occurrences of $O$ to the correct security classifications. The process of assigning data items to security classifications is called classifying and results in the transformation of a security object $O$ into a multilevel security object $O^m$ ($O \Rightarrow O^m$). The transformation is performed by means of the security constraints. In the following we assume $O^m$ is a flat table as in the definition of an MLS relation in the Jajodia-Sandhu model introduced in Subsection 2.2.2. Figure 6 contains graphical extensions which have been proposed for the Entity-Relationship model. Though very simple, these extensions offer a powerful tool for representing very complex application-dependent security constraints. They are stated in terms of sensitivity levels, ranges of sensitivity levels, security dependencies, predicates, and association, aggregation, and inference constraints. For the sake of simplicity, we distinguish only four different levels of sensitivity. If a finer granularity is required, the model can easily be extended to capture additional levels. A sensitivity level

FIG. 6. Graphical extensions to ER. [Figure legend: secrecy levels; ranges of secrecy levels (e.g., [Co..TS]); association leading to S (NK: near-key attribute); aggregation leading to TS (N: constant); inference leading to Co; security dependency; evaluation of predicate P.]

may be assigned to any structural concept of the ER model. If the occurrences of a security object are not uniformly labeled, a valid range of classifications is indicated by placing corresponding abbreviations next to the concept. In this case the concept itself must show a level that is dominated by all classifications of the instances or properties of the security object. The concept of a security dependency is introduced to indicate the origin of a classification. Predicates are included to express constraints that are dependent on the content of the security objects. Predicates cannot be specified in the diagrammatic representation and are instead expressed by means of the security-constraint language SCL. Other graphical extensions will be discussed when introducing the corresponding classification constraints. The model we are proposing distinguishes between two types of security constraints, application-independent and application-dependent constraints. Application-independent constraints must be valid in every multilevel database, whereas application-dependent constraints are specified by the database designer. By following the proposed methodology the design of a multilevel database application becomes a two-phase activity. In a first design phase the designer specifies the application-dependent security requirements using ER modeling techniques together with SCL. In the

second phase the constraints are analyzed, inasmuch as the specified constraints may conflict with other constraints or may violate application-independent rules. In the semantic data model for multilevel security we are proposing, the final design step involves checking the constraints for conflicts, resolving conflicting constraints, and applying the nonconflicting constraint set to construct a conceptual representation of the multilevel application. Consistency and conflict management are discussed in more detail in Subsection 4.3.

4.2 Classification Constraints

In the following discussion we present a taxonomy of the most relevant security semantics that have to be expressed in a conceptual data model. These constraints were initially defined by Pernul et al. (1993). Two types of application-dependent classification constraints are distinguished: (a) constraints that classify the characteristic properties of security objects (simple, content-based, complex, and level-based constraints), and (b) constraints that classify retrieval results (association-based, inference, and aggregation constraints). The examples which we will consider focus on the Project-Employee database given in the Introduction. We assume the existence of a single category only and a list SL of four sensitivity levels, denoted SL = (TS, S, Co, U). Note that the default level public is not in SL and, therefore, may not be assigned except for initializing.

4.2.1 Simple Constraints

Simple constraints classify certain characteristic properties of the security objects, for example, the characteristic property that employees have a salary (i.e., classifying property Salary) or the fact that employees are assigned to projects.

FIG. 7. Graphical representation of simple constraint.

Definition. Let $X$ be the set of characteristic properties of security object $O$ ($X \subseteq \{A_1, \ldots, A_n\}$). A simple security constraint SiC is a classification of the form $SiC(O(X)) = C$ ($C \in SL$), and results in a multilevel object $O^m(A_1, C_1, \ldots, A_n, C_n, TC)$, where $C_i = C$ for all $A_i \in X$ and $C_i$ is not changed if $A_i \notin X$.

SCL predicate. SiC (O, X, C), where O is the security object under consideration, X the set of characteristic properties to be classified, and C the desired security level.

Example and graphical representation. The property Function of Assignment is regarded as confidential information. SiC (Assignment, {Function}, S)
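The effect of a simple constraint on the labels of a security object can be sketched in a few lines of Python. The label dictionary, the function name `apply_simple_constraint`, and the attribute names other than Function are assumptions for illustration; the tuple class TC is left out of the sketch.

```python
def apply_simple_constraint(labels, properties, level):
    """SiC(O, X, C): set C_i = C for every A_i in X, leave other labels as is."""
    return {
        attr: (level if attr in properties else current)
        for attr, current in labels.items()
    }


# Every property starts at the default level "public" (completeness).
assignment_labels = {"Title": "public", "SSN": "public", "Function": "public"}

# SiC (Assignment, {Function}, S)
classified = apply_simple_constraint(assignment_labels, {"Function"}, "S")
print(classified["Function"])  # S
```

Starting from the default level public for every property reflects the completeness requirement of Subsection 4.1: a property that no constraint touches simply stays public.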

4.2.2 Content-Based Constraints

Content-based constraints classify characteristic properties of the security objects based on the evaluation of a predicate defined on specific properties of this object.

Definition. Let $A_i$ be a characteristic property of security object $O$ with domain $D_i$, $P$ a predicate defined on $A_i$, and $X \subseteq \{A_1, \ldots, A_n\}$. A content-based constraint CbC is a security classification of the form $CbC(O(X), P: A_i \,\theta\, a) = C$ or $CbC(O(X), P: A_i \,\theta\, A_j) = C$ ($\theta \in \{=, \neq, <, >, \leq, \geq\}$, $a \in D_i$, $i \neq j$, $C \in SL$). A predicate may be combined with other predicates by means of logical operators. For any instance $o$ of security object $O(A_1, \ldots, A_n)$ for which a predicate evaluates true, a transformation to $o(a_1, c_1, \ldots, a_n, c_n, tc)$ is performed. Classifications are assigned in such a way that $c_i = C$ if $A_i \in X$; otherwise $c_i$ is not changed.

SCL predicate. CbC (O, X, A, θ, V, C), where O is the security object under consideration, X the set of characteristic properties to be classified, A the evaluated characteristic property $A_i$, θ the comparison operator, V the comparison value $a$ or characteristic property $A_j$, and C the security level desired.

Example and graphical representation. Properties SSN and Name of employees with a salary ≥ 100 are treated as confidential information. CbC (Employee, {SSN, Name}, Salary, '≥', '100', Co)
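A content-based constraint differs from a simple one only in that the labels change when the predicate holds for the instance at hand. The Python sketch below makes this explicit; the function name `apply_content_based`, the `value_is_property` flag used to distinguish a constant $a$ from a second property $A_j$, and the ASCII operator spellings are assumptions for illustration.

```python
import operator

# Comparison operators of the predicate, written in ASCII.
OPERATORS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge,
}


def apply_content_based(instance, labels, properties, attr, op, value,
                        level, value_is_property=False):
    """CbC(O, X, A, theta, V, C) applied to one instance of O."""
    # V is either a constant a or the name of another property A_j.
    rhs = instance[value] if value_is_property else value
    if not OPERATORS[op](instance[attr], rhs):
        return dict(labels)  # predicate false: labels stay unchanged
    return {a: (level if a in properties else c) for a, c in labels.items()}


labels = {"SSN": "public", "Name": "public", "Salary": "public"}
well_paid = {"SSN": "123", "Name": "Ann", "Salary": 120}
other = {"SSN": "456", "Name": "Bob", "Salary": 80}

# CbC (Employee, {SSN, Name}, Salary, '>=', 100, Co)
print(apply_content_based(well_paid, labels, {"SSN", "Name"}, "Salary", ">=", 100, "Co"))
print(apply_content_based(other, labels, {"SSN", "Name"}, "Salary", ">=", 100, "Co"))
```

Because the predicate is evaluated per instance, two employees of the same entity type can end up with different labels, which is why the diagrammatic notation shows a range of levels for such properties.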

FIG. 8. Graphical representation of content-based constraint.

4.2.3 Complex Constraints

Complex security constraints relate to two different security objects participating in a dependency relationship. They are treated like content-based constraints, the only difference being that the predicate is evaluated on a specific property of the independent security object, yielding a classification of the properties of the associated dependent security object.

Definition. Let $O$, $O'$ be two security objects and assume that the existence of an instance $o$ of $O$ is dependent on the existence of a corresponding occurrence $o'$ of $O'$, where the $k$ values of the identifying property $K'$ for $o'$ are identical to $k$ values of the characteristic properties of $o$ (foreign key). Let $P(O')$ be a valid predicate (in the sense of the content-based constraints) defined on $O'$ and let $X \subseteq \{A_1, \ldots, A_n\}$ be an attribute set of $O$. A complex security constraint CoC is a security classification of the form $\mathrm{CoC}(O(X), P(O')) = C$ ($C \in SL$). For every instance $o$ of security object $O(A_1, \ldots, A_n)$ for which the predicate evaluates true in the related object $o'$ of $O'$, a transformation to $o(a_1, c_1, \ldots, a_n, c_n, tc)$ is performed. Classifications are assigned in such a way that $c_i = C$ if $A_i \in X$; otherwise $c_i$ is unchanged.

SCL predicate. CoC (OD, X, O, A, θ, V, C), where OD is the dependent security object under consideration, X the set of characteristic properties of OD which are to be classified, A the evaluated characteristic property $A_i$ of $O'$, θ the comparison operator, V the comparison value $a$ or characteristic property $A_j$ of $O'$, and C the security level desired.

Example and graphical representation. Individual assignment data (SSN) are regarded as secret information if the assignment refers to a project with Subject = 'Research'.

CoC (Assignment, {SSN}, Project, Subject, '=', 'Research', S)
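A complex constraint differs from a content-based one only in where the predicate is evaluated. The following minimal sketch follows the foreign key from the dependent instance to the independent object and tests the predicate there. The function name `coc`, the restriction to the '=' operator, the level "U", and the sample data are assumptions made for illustration.

```python
def coc(dependent, levels, props, foreign_key, parents, attr, value, level):
    """CoC(OD, X, O, A, '=', V, C), restricted to equality for brevity.

    `parents` maps key values of the independent object O to its instances.
    """
    parent = parents[dependent[foreign_key]]   # follow the foreign key to O
    new_levels = dict(levels)
    if parent[attr] == value:                  # predicate is evaluated on O
        for prop in props:
            new_levels[prop] = level           # ...but classifies OD
    return new_levels


# Hypothetical Project and Assignment instances.
projects = {
    "P1": {"Title": "P1", "Subject": "Research", "Client": "ACME"},
    "P2": {"Title": "P2", "Subject": "Sales", "Client": "Initech"},
}
default = {"SSN": "U", "Title": "U", "Date": "U", "Function": "U"}
on_research = {"SSN": "111", "Title": "P1", "Date": "1993-01-01", "Function": "Lead"}
on_sales = {"SSN": "111", "Title": "P2", "Date": "1993-02-01", "Function": "Member"}

research_levels = coc(on_research, default, ["SSN"], "Title",
                      projects, "Subject", "Research", "S")
sales_levels = coc(on_sales, default, ["SSN"], "Title",
                   projects, "Subject", "Research", "S")
print(research_levels["SSN"], sales_levels["SSN"])
```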

FIG. 9. Graphical representation of complex constraint.

4.2.4 Level-Based Constraints

Level-based security constraints classify characteristic properties based on the classification of certain other properties of the same security object. This means that, for all instances of a security object, the particular characteristic properties are always required to be at the same security level.

Definition. Let level($A_i$) be a function that returns the classification $c_i$ of the value of characteristic property $A_i$ in the object $o(a_1, c_1, \ldots, a_n, c_n, tc)$ of a multilevel security object $O^{m}$. Let $X$ be a set of characteristic properties of $O^{m}$ such that $X \subseteq \{A_1, \ldots, A_n\}$. A level-based security constraint LbC is a classification of the form $\mathrm{LbC}(O(X)) = \mathrm{level}(A_i)$ and, for every object $o(a_1, c_1, \ldots, a_n, c_n, tc)$, results in the assignment $c_j = c_i$ if $A_j \in X$.

SCL predicate. LbC (O, X, A), where O is the security object under consideration, X the set of characteristic properties to be classified, and A the governing characteristic property.

Example and graphical representation. The property Client of security object Project must always have the same classification as the property Subject of the Project.

LbC (Project, {Client}, Subject)

While the constraints considered so far classify characteristic properties of security objects, the following additional constraints classify retrieval results. This is necessary, since security may require that the sensitivity of the result of a query be different from the classifications of the constituent security objects. With this policy we respond to the logical association, aggregation, and logical inference problems.

FIG. 10. Graphical representation of level-based constraint.
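The level-based constraint of Section 4.2.4 copies the classification of the governing property to the properties in X for every instance. A minimal sketch, in which the function name `lbc`, the level "U", and the sample labels are hypothetical:

```python
def lbc(levels, props, governing):
    """LbC(O, X, A): every property in X takes the level of property A."""
    new_levels = dict(levels)
    for prop in props:
        new_levels[prop] = levels[governing]
    return new_levels


# Hypothetical labels of two Project instances.
secret_project = {"Title": "U", "Subject": "S", "Client": "U"}
open_project = {"Title": "U", "Subject": "U", "Client": "U"}

print(lbc(secret_project, ["Client"], "Subject"))
print(lbc(open_project, ["Client"], "Subject"))
```

Unlike a simple constraint, no fixed level appears in the rule: Client follows Subject, whatever classification Subject receives from other constraints.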

4.2.5 Association-Based Constraints

Association-based security constraints restrict the combination of the values of certain characteristic properties with the identifying property of the security object in the retrieval result. This permits access to collective data but prevents the user from relating properties to individual instances of the security object.

Definition. Let $O(A_1, \ldots, A_n)$ be a security object with identifying property $K$. Let $X \subseteq \{A_1, \ldots, A_n\}$ ($K \cap X = \{\}$) be a set of characteristic properties of $O$. An association-based security constraint AbC is a classification of the form $\mathrm{AbC}(O(K, X)) = C$ ($C \in SL$) and results in the assignment of security level $C$ to the retrieval result of each query that takes $X$ together with the identifying property $K$.

SCL predicate. AbC (O, X, C), where O is the security object under consideration, X the set of characteristic properties to be classified when retrieved together with the identifying property, and C the security level.

FIG. 11. Graphical representation of association-based constraint.

Example and graphical representation. The example treats the salary of an individual person as confidential, while the values of salaries, without information as to which employee receives which salary, are unclassified.

AbC (Employee, {Salary}, Co)
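Since an association-based constraint labels query results rather than stored values, it can be sketched as a check on the set of properties a query retrieves. The function name `abc_level` is hypothetical, and the sketch assumes that retrieving at least one property of X together with the complete key is enough to trigger the constraint; the definition could also be read as requiring all of X.

```python
def abc_level(selected, key, props, level):
    """AbC(O, X, C): return C if the query combines the identifying
    property with a property in X, otherwise None (constraint not triggered)."""
    selected = set(selected)
    takes_key = set(key) <= selected
    takes_sensitive = bool(selected & set(props))
    return level if takes_key and takes_sensitive else None


# AbC (Employee, {Salary}, Co) with identifying property SSN.
print(abc_level(["SSN", "Salary"], ["SSN"], ["Salary"], "Co"))  # key + salary
print(abc_level(["Salary"], ["SSN"], ["Salary"], "Co"))         # salaries only
print(abc_level(["SSN", "Name"], ["SSN"], ["Salary"], "Co"))    # no salary
```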

4.2.6 Aggregation Constraints

Under certain circumstances a combination of several instances of the same security object may be regarded as more sensitive than a query result consisting of a single instance only. This phenomenon is known as the aggregation problem. It occurs in cases where the number of instances of a query result exceeds some specified constant value.

Definition. Let count($O$) be a function that returns the number of instances referenced by a particular query and belonging to security object $O(A_1, \ldots, A_n)$. Let $X$ ($X \subseteq \{A_1, \ldots, A_n\}$) be the sensitive characteristic properties of $O$. An aggregation security constraint AgC is a statement of the form $\mathrm{AgC}(O, (X, \mathrm{count}(O) > n)) = C$ ($C \in SL$, $n \in \mathbb{N}$) and results in a classification $C$ for the retrieval results of a query if $\mathrm{count}(O) > n$, i.e., if the number of instances of $O$ referenced by a query accessing properties $X$ exceeds the value $n$.

SCL predicate. AgC (O, X, N, C), where O is the security object under consideration, X the set of characteristic properties, N the specified value n, and C the security level of the corresponding queries.

Example and graphical representation. The information as to which employee is assigned to what projects is considered unclassified. However, aggregating all assignments for a certain project and, thereby, inferring which team (aggregate of assigned employees) is responsible for what project is considered secret. To treat this situation a maximum value of n = 3 should be specified.

AgC (Assignment, {Title}, '3', S)

FIG. 12. Graphical representation of aggregation-based constraint.
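The aggregation constraint above can be sketched as a count check on the instances a query references. The function name `agc_level` and the sample rows are hypothetical; the threshold n = 3 and the level S come from the example.

```python
def agc_level(rows, selected, props, n, level):
    """AgC(O, X, N, C): return C if the query accesses a property in X
    and references more than n instances of O, otherwise None."""
    accesses_sensitive = bool(set(selected) & set(props))
    return level if accesses_sensitive and len(rows) > n else None


# Hypothetical Assignment instances for one project.
team = [{"SSN": str(i), "Title": "P1"} for i in range(4)]

print(agc_level(team, ["SSN", "Title"], ["Title"], 3, "S"))      # 4 instances
print(agc_level(team[:3], ["SSN", "Title"], ["Title"], 3, "S"))  # 3 instances
```

Note that the condition is strict: a result with exactly n instances is not reclassified.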

4.2.7 Inference Constraints

Inference constraints restrict the use of unclassified data to infer data which is classified. Inferences can occur because of hidden paths that are not explicitly represented in the conceptual data model of the multilevel application. The hidden paths may also involve knowledge from outside the database application domain.

Definition. Let PO be the set of multilevel objects involved in a potential logical inference. Let $O$, $O'$ be two particular objects from PO with corresponding multilevel representations $O(A_1, C_1, \ldots, A_n, C_n, TC)$ and $O'(A'_1, C'_1, \ldots, A'_k, C'_k, TC')$. Let $X \subseteq \{A_1, \ldots, A_n\}$ and $Y \subseteq \{A'_1, \ldots, A'_k\}$. A logical inference constraint IfC is a statement $\mathrm{IfC}(O(X), O'(Y)) = C$ and results in the assignment of security level $C$ to the retrieval result of each query that takes $Y$ together with the properties in $X$.

SCL predicate. IfC (O1, X1, O2, X2, C), where O1 is the first security object involved, X1 the set of characteristic properties of O1 that might be used for logical inference, O2 the second security object, X2 the attribute set of O2, and C the security level of the corresponding queries.

Example and graphical representation. As an example consider a situation in which the information as to which employee is assigned to what projects is considered confidential. Consider, further, that on the basis of access to the department which an employee works for and access to the subject of a project, users (with certain knowledge from outside the system) may infer which department is responsible for the project, and, thus, can determine which employees are involved. The situation is modeled below.

IfC (Employee, {Dep}, Project, {Subject}, Co)

FIG. 13. Graphical representation of inference constraint.
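An inference constraint spans two security objects, so a sketch has to track which object each retrieved property belongs to. Below, a query is represented as a set of (object, property) pairs. This representation, the function name `ifc_level`, and the reading that one property from each attribute set suffices to trigger the constraint are assumptions for illustration.

```python
def ifc_level(selected, obj1, props1, obj2, props2, level):
    """IfC(O1, X1, O2, X2, C): return C if the query takes properties
    from X1 of O1 together with properties from X2 of O2, otherwise None."""
    selected = set(selected)
    uses_first = any((obj1, p) in selected for p in props1)
    uses_second = any((obj2, p) in selected for p in props2)
    return level if uses_first and uses_second else None


# IfC (Employee, {Dep}, Project, {Subject}, Co)
rule = ("Employee", ["Dep"], "Project", ["Subject"], "Co")

joined = [("Employee", "Dep"), ("Project", "Subject")]
harmless = [("Employee", "Dep"), ("Project", "Title")]
print(ifc_level(joined, *rule))
print(ifc_level(harmless, *rule))
```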

4.3 Consistency and Conflict Management

The classification constraints specified by the designer must be stored in a rule base. For complex applications it might be necessary to express a large set of security constraints at the conceptual database level. Verifying the consistency of the constraints is one of the more difficult design tasks. We propose that an automated tool which dynamically assists the designer in the specification and refinement of the security constraints be applied here. The tool must ensure that the rule base remains consistent whenever a classification constraint is updated or a new constraint is inserted in the rule base.

In the proposed conceptual model for multilevel security two types of conflicts are distinguished. The first type is concerned with conflicts between application-dependent and application-independent constraints. Because we are expressing the security semantics in the conceptual schema, application-independent multilevel constraints could be violated. In the proposed system, these conflicts are detected automatically and resolved, and the designer is then notified. However, if an application-dependent security constraint is in conflict with an application-independent constraint, the designer does not have a chance to override the changes performed by the tool. The second type arises between conflicting application-dependent security constraints. The designer is informed of such conflicts and then decides on the correct classification. As a default strategy, the tool suggests the maximum of the conflicting security levels to guarantee the highest degree of security possible.

The following is the set of integrity constraints which the set of classification constraints must satisfy:

[I1]: Multilevel Integrity. Each property must have a security level. This is satisfied, since in the initial classification all properties are assigned the default security level.

[I2]: Entity Integrity. All properties forming an identifying property must be uniformly classified and must be dominated by all the other classifications of the object. The tuple-class must dominate all classifications. A multilevel security object $O^{m}$ with identifying property $K$ (apparent key) satisfies the entity integrity property if, for all occurrences $o(a_1, c_1, \ldots, a_n, c_n, tc)$ of $O^{m}$:

1. $A_i, A_j \in K \Rightarrow c_i = c_j$
2. $A_i \in K,\ A_j \notin K \Rightarrow c_i \leq c_j$
3. $tc \geq c_i$ ($i = 1..n$).

[I3]: Foreign-Key Property. The level assigned to a foreign key must dominate the level of the corresponding identifying property. The foreign-key property guarantees that no dangling references between depending objects will occur. Let $K$ be the identifying property in the multilevel security object $O^{m}(A_1, C_1, \ldots, A_n, C_n, TC)$ and let it be a foreign key $K'$ in a dependent object $O'^{m}(A'_1, C'_1, \ldots, A'_k, C'_k, TC')$. The foreign-key property is satisfied if, for any two dependent occurrences $o(a_1, c_1, \ldots, a_n, c_n, tc)$ of $O^{m}$ and $o'(a'_1, c'_1, \ldots, a'_k, c'_k, tc')$ of $O'^{m}$, $A_i \in K,\ A'_j \in K' \Rightarrow c_i \leq c'_j$.

[I4]: Near-Key Property. The near-key property is important if an association-based constraint AbC (O, X, C) is specified. In this case C is also propagated to each query that takes a near key instead of the identifying property of O together with the attribute set X.

[I5]: Level-Based Property. In order to avoid transitive propagation of security levels between specified level-based constraints, for any two constraints LbC (O, X, A) and LbC (O, X', A'), $A \notin X'$ and $A' \notin X$ must hold. Additionally, because of entity integrity, an LbC may not be defined on an attribute set including the identifying property.

[I6]: Multiple-Classification Property. Each value of a characteristic property may have only a single classification. If different security constraints assign more than one level to a particular property value, the designer must be notified of the conflict. The designer then decides whether or not to adopt the default resolution strategy.
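Three of the integrity constraints above, [I2], [I3], and [I5], are simple enough to be expressed as checks over the labels of one occurrence. The sketch below is not the tool described in this section; the function names, the total order encoded in `RANK` (including the levels "U" and "TS"), and the sample labels are assumptions for illustration.

```python
# Assumed total order of levels; "U" (public) and "TS" are illustrative additions.
RANK = {"U": 0, "Co": 1, "S": 2, "TS": 3}


def entity_integrity(levels, key, tuple_class):
    """[I2]: uniform key level, key dominated by non-key levels, tc dominates all."""
    key_ranks = {RANK[levels[a]] for a in key}
    if len(key_ranks) != 1:
        return False                      # condition 1 violated
    key_rank = key_ranks.pop()
    non_key = [RANK[c] for a, c in levels.items() if a not in key]
    if any(key_rank > r for r in non_key):
        return False                      # condition 2 violated
    return all(RANK[tuple_class] >= RANK[c] for c in levels.values())


def foreign_key_property(parent_levels, parent_key, child_levels, foreign_key):
    """[I3]: every foreign-key level dominates every referenced key level."""
    return all(RANK[parent_levels[a]] <= RANK[child_levels[b]]
               for a in parent_key for b in foreign_key)


def level_based_property(lbcs, key):
    """[I5]: lbcs is a list of (X, A) pairs taken from LbC(O, X, A)."""
    for i, (props, governing) in enumerate(lbcs):
        if set(props) & set(key):
            return False                  # LbC defined on the identifying property
        for j, (other_props, _) in enumerate(lbcs):
            if i != j and governing in other_props:
                return False              # transitive propagation would be possible
    return True


# Hypothetical labels of one Assignment and one Employee occurrence.
assignment = {"SSN": "Co", "Title": "Co", "Date": "Co", "Function": "S"}
employee = {"SSN": "Co", "Name": "Co", "Dep": "Co", "Salary": "Co"}
print(entity_integrity(assignment, ["SSN", "Title"], "S"))
print(foreign_key_property(employee, ["SSN"], assignment, ["SSN"]))
print(level_based_property([(["Client"], "Subject")], ["Title"]))
```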

4.4 Modeling the Example Application

Classifying is performed by stepwise insertion of security constraints into the rule base. Declaring a new constraint is an interactive process between tool and designer whereby each constraint is validated against the integrity constraints. If a conflict is detected which violates an application-independent integrity constraint, the constraint is enforced by propagating the required classification to the characteristic properties involved. If a conflict is due to multiple classification, the designer is told of the conflict and decides whether or not to adopt the default resolution strategy. Let us now apply the classification requirements to the sample design. For the sake of convenience, the corresponding rules specified in SCL are given below once again.

1. SiC (Assignment, {Function}, S)
2. CbC (Employee, {SSN, Name}, Salary, '≥', '100', Co)
3. CoC (Assignment, {SSN}, Project, Subject, '=', 'Research', S)
4. LbC (Project, {Client}, Subject)
5. AbC (Employee, {Salary}, Co)
6. AgC (Assignment, {Title}, '3', S)
7a. SiC (Assignment, {SSN, Title}, Co)
7b. IfC (Employee, {Dep}, Project, {Subject}, Co)

Classifying starts with the assignment of the default classification level to every characteristic property. Insertion of rule 1 results in the assignment of S to property Function. No conflicts result. Insertion of rule 2 leads to the assignment of the range [∅..Co] to properties SSN and Name of Employee. That is, if the predicate evaluates true, Co is assigned to the properties; otherwise the classification remains public (denoted ∅). Because of the application-independent integrity constraint, which specifies that the classification of the identifying property must be dominated by all other classifications of an object, the insertion of this CbC causes a violation of entity integrity. As a consequence, the classification range [∅..Co] is automatically propagated to the other properties of the object-type Employee as well. The identifying property of Employee (i.e., SSN) is also a foreign key in Assignment. Because of the foreign-key property, [∅..Co] must also be propagated to SSN of Assignment. There, classifying SSN with [∅..Co] violates entity integrity, causing, first, propagation of [∅..Co] to the property Title (the key must be uniformly classified) and, second, propagation of [∅..Co] to the properties Date and Function as well (all other classifications must dominate the key). Since property Function is already assigned to S, the first conflict arises and is reported to the designer. Let us assume the designer confirms the suggested classification and Function remains classified at S. No further conflicts arise.
The complex security constraint specified as rule 3 states that SSN of Assignment is considered at S if an assignment refers to a project with Subject = 'Research'. Insertion of the constraint in the rule base causes a multiple-classification conflict, because [∅..Co] is already assigned to SSN of Assignment. Let us assume that the designer accepts the suggested default resolution strategy, so that [∅..S] is assigned to SSN. Since the key must be uniformly classified, this causes a conflict with entity integrity and [∅..S] is propagated to property Title as well. Because of the demand that the classification of an identifying property must be dominated by all other classifications of the object, [∅..S] is also propagated to Date and Function. Propagating [∅..S] to attribute Function causes a multiple-classification conflict. This is because rule 1 has already assigned a classification S. The designer is notified of the conflict. Let us assume that the designer confirms the suggested default resolution strategy and S remains assigned. Figure 14 shows the state of design after conflict resolution and before insertion of constraint 4.

FIG. 14. State of design following application of constraint 3.

Introducing the level-based constraint specified in rule 4 does not cause any conflicts. Inserting the association-based constraint specified in rule 5 causes a violation of the near-key integrity property. The conflict is resolved by including the near-key integrity property in the constraint. Inserting rule 6 does not cause any conflicts. Rule 7a leads to multiple classification because SSN and Title of Assignment are already classified at [∅..S]. Let us assume that the designer accepts the default conflict-resolution strategy, which yields [Co..S]. Because of the need to enforce entity integrity this causes propagation of [Co..S] to all the other properties of Assignment as well. In the case of the property Function, a conflict arises because Function is already assigned to S. We again assume that the designer has accepted the suggested resolution strategy. Finally, the inference constraint (rule 7b), which classifies certain query results, is included in the conceptual model. Figure 15 gives a graphical representation of the conceptual data model of the sample multilevel application following classification and conflict resolution. An optional implementation of the graphical browser should provide a tracing facility, giving the designer the ability to trace back all the classification steps which have led to certain classifications.

The contribution of this section is to develop a semantic data model for multilevel security.
The model provides an integrated approach for modeling both the data and the security semantics of a database application. The proposal made in this section extends previous work on semantic modeling of sensitive information by carefully defining the security semantics considered, providing a constraint language and a graphical notation to express the semantics in a conceptual model, and developing consistency criteria which the set of specified classification constraints must satisfy.

FIG. 15. Conceptual model of the sample database.

The technique can be extended in several directions. In the case of certain database applications, for example, it may also be necessary to model the dynamic aspects of information. A first step in this direction has already been taken by Burns (1992) and Pernul (1992b). The model also has to be completely implemented. So far the implementation is only at the prototype level and covers only the constraint language SCL and conflict management. Implementation of the graphical browser is left for further study.

Another important issue for the database community is deciding when to enforce the security constraints represented in the conceptual representation of the database. In general, security constraints may be enforced during database update, during query processing, or during database design. If the constraints are handled during database update, they are treated by the DBMS like integrity constraints. If they are enforced during query processing, they may be treated like derivation rules, that is, employed to assign classifications before data is released from the DBMS domain to the user domain. Finally, if they are handled during the database design phase, they must be properly represented in the database structure and in the metadata. Deciding when to enforce the constraints may depend on the type of constraint being considered. However, it is important to note that enforcing the constraints during query processing or during database update will strongly influence the performance of the database. From this point of view as many constraints as possible should be enforced during the design of the database.
The technique proposed in this section serves as a valuable starting point for a logical design stage during which the conceptual representation of the database is transferred into a target data model, for example, the multilevel relational data model.
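Returning to the conflict-resolution steps of the example in this section: one way to read the default strategy ("the maximum of the conflicting security levels") for classification ranges is to take the maximum of the lower bounds and of the upper bounds separately. This component-wise reading is an assumption, not something the text states, but it reproduces the ranges reported for rules 1 to 7a. Ranges are written as (low, high) tuples, "U" plays the role of the public level ∅, and all function names are hypothetical.

```python
RANK = {"U": 0, "Co": 1, "S": 2}   # "U" stands for the public level


def higher(a, b):
    """Return the dominating one of two levels."""
    return a if RANK[a] >= RANK[b] else b


def resolve(existing, incoming):
    """Default resolution, read as a component-wise maximum of two ranges."""
    return (higher(existing[0], incoming[0]), higher(existing[1], incoming[1]))


def fixed(level):
    """Range produced by an unconditional (simple) constraint."""
    return (level, level)


def conditional(level):
    """Range produced by a constraint whose predicate may or may not hold."""
    return ("U", level)


function = fixed("S")                        # rule 1
ssn = conditional("Co")                      # rule 2, propagated via the foreign key
function = resolve(function, ssn)            # propagation to Function, S remains
ssn = resolve(ssn, conditional("S"))         # rule 3
after_rule_3 = ssn
function = resolve(function, ssn)            # propagation to Function, S remains
ssn = resolve(ssn, fixed("Co"))              # rule 7a
function = resolve(function, ssn)            # propagation to Function, S remains
print(after_rule_3, ssn, function)
```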

5. Standardization and Evaluation Efforts

Database security (and computer security in general) is currently subject to intensive national and international standardization and evaluation efforts. The efforts have as their goal the development of metrics for use in evaluating the degree of trust that can be placed in computer products used to process sensitive information. By "degree of trust," we understand the level of assurance that the security enforcing functions of a system are working properly. The efforts have all been based on the "Orange Book" criteria (TCSEC, 1985) issued by the U.S. National Computer Security Center (NCSC). Since then, the criteria have been used to evaluate products in the U.S. and in many other countries as well. Shortly after its release, the Orange Book was criticized because of its orientation towards confidentiality and secrecy issues and because its main focus was on centralized computer systems and operating systems. As a consequence, NCSC has issued two interpretations of the Orange Book, the "Red Book," an interpretation for networks, and the "Purple Book" (TDI, 1990), an interpretation for databases. Together with other documents issued by NCSC, the standards are known as the "rainbow series" because of the color of their title pages.

Within Europe there have been a number of national initiatives in the development of security evaluation criteria. Recognizing the common interest and similar principles underlying their efforts, four European countries (France, Germany, the Netherlands, and the United Kingdom) have cooperated in the development of a single set of harmonized criteria issued by the Commission of the European Communities (ITSEC, 1991). Besides these efforts, criteria sets have also been published in Canada and Sweden. Because of the ongoing internationalization of the computer product market, there is a strong demand on the part of industry for establishing harmonization between TCSEC, ITSEC, and the other proposals.
A first step in this direction was taken by the studies performed as part of the US Federal Criteria Project, currently a draft under public review. In the following discussion we will briefly review the basic concepts of the Orange Book and show how they relate to corresponding concepts in ITSEC.

TCSEC defines four hierarchically ordered divisions (D, C, B, A) of evaluation classes. Within each of the divisions one or more hierarchical classes may be found. Figure 16, taken from the Orange Book, contains a detailed representation of this packaging. D-level criteria relate to all systems and products that cannot be evaluated at higher levels of trust. D-level requires no security features.

FIG. 16. Trusted Computer Security Evaluation Criteria summary chart (NCSC-TCSEC, 1985). For each of the classes C1, C2, B1, B2, B3, and A1 the chart marks every requirement as new or enhanced for this class, as carrying no additional requirements for this class, or as not required for this class. The requirements are grouped as follows. Security policy: discretionary access control, object reuse, labels, label integrity, exportation of labelled information, exportation to multilevel devices, exportation to single-level devices, labelling human-readable output, mandatory access controls, subject sensitivity labels, device labels. Accountability: identification and authentication, audit, trusted paths. Assurance: system architecture, system integrity, security testing, design specification and verification, covert channel analysis, trusted facility management, configuration management, trusted recovery, trusted distribution. Documentation: security features user's guide, trusted facility manual, test documentation, design documentation.

Systems rated at a C-level support DAC, which includes the support of identification, authentication, and auditing functions. At C1, DAC-based protection must only be provided at a user-group level, while for C2, protection at the individual user level is required. Most commercially available general-purpose DBMS products are evaluated at C2. At the B-level criteria, security labels and mandatory access controls are introduced. Enhancing existing DBMSs with add-on security packages may result in evaluation at B1, whereas for B2 and above the system must have been designed with security already in mind. At B2 emphasis is on assurance. For this purpose a formal security policy model must be developed, the roles of a system administrator and an operator introduced, and security-relevant code separated into a TCB. B3 requires an increased level of assurance, achieved by a greater amount of testing and by placing greater emphasis on auditing. Emphasis at B3 is also directed toward minimizing and simplifying TCB code. The A1 evaluation class is, in terms of functionality, identical to B3, though it requires formal techniques to exhibit and prove consistency between the specification and the formal security policy. It is not required to prove the source code against the specification and against the formal security policy. The systems discussed in Section 3 were developed with the aim of obtaining evaluation at the A1 level, whereas most commercial DBMS systems that support a mandatory security policy have been evaluated at the B1 or B2 level.

A number of deficiencies in TCSEC have been pointed out by several researchers (for example, Neumann, 1992). Besides the fact that distributed systems are not adequately covered (although the Red Book provides some guidelines) it has been noted that

• The primary focus of TCSEC is on confidentiality. Integrity and availability are not treated adequately.
• Authentication considers only passwords. More advanced techniques are not included.
• TCSEC provides inadequate defence against pest programs (Neumann, 1990).
• Auditing data (and its real-time analysis) can provide an important aid in protecting against vulnerabilities. This is not considered in the criteria.

ITSEC has been developed with some of the deficiencies of TCSEC in mind and is intended as a superset of TCSEC. It defines security as consisting of a combination of confidentiality, integrity, and availability, and distinguishes between two kinds of criteria: functional criteria with ten hierarchically ordered divisions and correctness criteria with seven divisions. Both kinds of criteria are evaluated separately. The functional criteria are used to evaluate the security enforcing functions of a system. They have been developed within the German national criteria project. The first five functionality divisions correspond closely to the functionality classes of TCSEC while the remaining five are intended as examples to demonstrate common requirements for particular types of systems. The correctness criteria represent seven levels of assurance as regards the correctness of the security features. They correspond roughly to the assurance levels of TCSEC and cumulatively require testing, configuration control, access to design specification and source code, vulnerability analysis, and formal and informal verification of the correspondence between specification, security model, and source code. Figure 17 relates the functional and correctness criteria of ITSEC to the corresponding evaluation classes of TCSEC.

    ITSEC functional   ITSEC correctness        TCSEC evaluation
    -                  E0                 <=>   D
    F-C1               E1                 <=>   C1
    F-C2               E2                 <=>   C2
    F-B1               E3                 <=>   B1
    F-B2               E4                 <=>   B2
    F-B3               E5                 <=>   B3
    F-B3               E6                 <=>   A1

FIG. 17. Correspondence between ITSEC and TCSEC.

Although it is commonly agreed that the evaluation criteria are a first step in the right direction, the market for commercial evaluation is still not fully developed. The existence of at least seven sets of evaluation criteria from different countries has produced an unwillingness on the part of developers to permit their products to be subjected to an evaluation process. However, it is commonly agreed that efforts at making the different criteria compatible, together with the growing number of evaluated products and the increasing number of customers showing a preference for evaluated products, may generate further interest among the public and society at large in database security (and computer security in general) and security evaluation.
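The correspondence shown in Fig. 17 amounts to a lookup from a pair of ITSEC ratings to a TCSEC class. The sketch below encodes the pairs as they are given in the correspondence table published with ITSEC (1991); the dictionary and function names are hypothetical, and the choice to return `None` for combinations without a listed TCSEC counterpart is an assumption of this sketch.

```python
# (ITSEC functionality class, ITSEC correctness level) -> TCSEC class.
# E0 has no associated functionality class, hence None.
ITSEC_TO_TCSEC = {
    (None, "E0"): "D",
    ("F-C1", "E1"): "C1",
    ("F-C2", "E2"): "C2",
    ("F-B1", "E3"): "B1",
    ("F-B2", "E4"): "B2",
    ("F-B3", "E5"): "B3",
    ("F-B3", "E6"): "A1",
}


def tcsec_class(functionality, correctness):
    """Return the corresponding TCSEC class, or None if the pair is not listed."""
    return ITSEC_TO_TCSEC.get((functionality, correctness))


print(tcsec_class("F-B3", "E6"))
print(tcsec_class("F-B1", "E2"))
```

The second call returns `None` because functionality and correctness are evaluated separately in ITSEC, so many combinations have no single TCSEC class as a counterpart.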

6. Future Directions in Database Security Research

The field of database security has been active for almost twenty years. During the early stages of research the focus was directed principally towards the discretionary aspect of database security, i.e., different forms of access control lists and view-based protection issues. Later the focus shifted towards mandatory controls, integrity issues, and security mechanisms fine-tuned to provide privacy. The major current trends are to provide tools that support the designer during the different database design phases that entail security-critical contents, to develop security semantics and classification constraints, to investigate the use of rules and triggers for various problems related to database security, to extend security issues to other data models, for example, distributed and heterogeneous databases, and to investigate in the course of physical design such questions as transaction and recovery management as well as the development of storage structures whose main focus is on the support of security. We now would like to outline what we believe will be the various directions the entire field will follow over the next few years.

System architecture of mandatory systems. Most DBMSs supporting MAC are based on the principles of balanced assurance and TCB subsetting. As a result, the DBMS is hosted on a TCB which is responsible for identification, user authentication, and mandatory access controls. Multilevel relations are only supported at an external level and the entire database is decomposed into single-level fragments which are stored using the storage manager of a general-purpose DBMS product. We believe this approach has several practical advantages but represents only a near-term solution to database security. What is needed in the near future are data models, storage structures, and transaction and recovery management procedures specially suited for use in DBMSs with a high degree of trust in their security features. A first step in this direction has already been taken in the case of secure transaction management (for example, Kogan and Jajodia, 1990, or Kang and Keefe, 1992a) and recovery management (Kang and Keefe, 1992b).

Formal specification and verification of MLS DBMSs. Assurance that the security features of a DBMS are working properly is required for DBMSs that contain databases with security-critical content. This entails a formal specification and verification of the DBMS specifications, the DBMS architecture, the DBMS implementation, as well as the design and implementation of the particular database application. So far, there is not much work on this topic and only very little experience in the use of existing systems and techniques to formally specify and verify databases. A natural next step would be to adopt existing techniques and use them for designing and implementing secure databases. A very good discussion of the pros and cons of formal methods within the framework of safety-critical systems is that of McDermid (1993).

Evaluation criteria. It is commonly agreed that the evaluation criteria represent a first step in the right direction.
However, since the international field of information technology providers will not be able to evaluate their products against different criteria in different countries, all the various criteria will have to be merged. Mutual recognition of the security certifications and evaluations of different countries is also necessary. Moreover, as technology evolves, the concept of security will have to be extended to an open, heterogeneous, multi-vendor environment. In the future, systems will have to be considered for evaluation that differ from what we are familiar with today. For example, object-oriented systems, knowledge-based systems, active systems, multimedia systems, or hypertext may become candidates for evaluation. To cover future development, criteria must be open-ended and, thereby, address the needs of new information technology environments which have yet to be explored.

Extending security to nonrelational data models. It is only recently that security has been discussed in the context of nonrelational data models. Preliminary work has begun on the development of security models for object-oriented databases (for multilevel approaches, see Keefe et al., 1989, Jajodia and Kogan, 1990, Thuraisingham, 1992, and Millen and Lunt, 1992; for discretionary models, see Fernandez et al., 1989, Rabitti et al., 1991, and Fernandez et al., 1993), for knowledge-based systems (see Morgenstern, 1987, and Thuraisingham, 1990), for multimedia databases (see Thuraisingham, 1991), and for hypertext (see Merkl and Pernul, 1994). So far, the Personal Knowledge Approach is the only data model that was initially developed with the main goal of meeting security requirements. All the other approaches have adopted existing data models for use in security-critical environments. It is expected that further research will lead to new data models in which security is among the major design decisions.

Research issues in discretionary security. The presence of more advanced data models, for example, the object-oriented data model, has renewed interest in discretionary access controls. Further research issues include explicit negative authorization, group authorization, propagation of authorization, propagation of revocations, authorizations on methods and functions, and the support of roles.

Design aids and tools. Future research is necessary for the development of aids and tools to support the designer during the different phases involved in the design of a database with security-critical content. Research is needed in an integrated fashion and must span requirements analysis, conceptual and logical design, security semantics, and integrity rules, as well as prototyping, testing, and benchmarking. Aids, guidelines, and tools are needed for both discretionary and mandatory protected databases.

Extending security to distributed and heterogeneous databases. Distribution adds a further dimension to security because distributed systems are vulnerable to a number of additional security attacks, for example, data communication attacks. Even more complicated is the case in which heterogeneous DBMSs are chosen to form a federation. Since the participating component databases continue to operate autonomously and the security mechanisms may differ between the sites, additional security gateways and controls may be necessary. The steps involved in building a secure distributed heterogeneous DBMS are by no means straightforward, and some researchers believe that, given the current state of the art of both database security and federated database technology, such a DBMS is not even possible.
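Two of the discretionary research issues listed above, explicit negative authorization and group authorization, interact in a way that a small example may make clearer. The Python sketch below is only one possible reading: the names `Authorization` and `is_permitted` are hypothetical, and the conflict rule used here (any applicable denial overrides every applicable grant, and no applicable authorization means no access) is an assumption chosen for illustration, not a policy prescribed by the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Authorization:
    subject: str    # a user name or a group name
    obj: str        # the protected object, e.g. a relation name
    action: str     # e.g. "read" or "update"
    positive: bool  # False marks an explicit negative authorization


def is_permitted(user, groups, obj, action, auths):
    """Decide access under an assumed 'denial takes precedence' policy."""
    subjects = {user} | set(groups)
    relevant = [
        a for a in auths
        if a.subject in subjects and a.obj == obj and a.action == action
    ]
    # An explicit denial, whether given to the user or inherited from a
    # group, overrides any grant.
    if any(not a.positive for a in relevant):
        return False
    # Closed-world assumption: without a matching grant, access is refused.
    return any(a.positive for a in relevant)
```

With a grant to a group `staff` and an explicit denial for one member, every other member keeps access while that member is refused. Other conflict rules (for example, letting the more specific subject win) are equally conceivable, which is part of why the text lists this as an open research issue.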

Security and privacy. Addressing security and privacy themes must remain a future topic of database research. Security and privacy are among the most important topics in medical informatics, for example, in integrated hospital information systems. In numerous medical venues computerized information systems have been introduced with little regard to security and privacy controls. Coping with the availability, confidentiality, and privacy of computer-based patient records is a challenge that database security will have to meet in the near future.

7. Conclusions

In the present essay we have proposed models and techniques which provide a conceptual framework in the effort to counter the possible threats to database security. Emphasis has been given to techniques primarily intended to assure a certain degree of confidentiality, integrity, and availability of the data. Privacy and related legal issues of database security were also discussed, though not as fully.

Although our main focus was on the technological issues involved in protecting a database, it should be recognized that database security includes organizational, personnel, and administrative security issues as well. Database security is not an isolated problem; in its broadest sense it is a total system problem. Database security depends not only on the choice of a particular DBMS product or on the support of a certain security model, but also on the operating environment and the people involved. Although not discussed here, further database security issues include requirements on the operating system, network security, add-on security packages, data encryption, security in statistical databases, hardware protection, software verification, and others.

There is a growing interest in database security, and the approaches which we have reported demonstrate the considerable success which has been achieved in developing solutions to the problems involved. Public interest has increased dramatically, though it is only recently that the issue of security outside the research community has begun to receive the attention which its importance warrants. Though database security has been a subject of intensive research for almost two decades, it is still one of the major and most fascinating research areas. It is expected that changing technology will introduce new vulnerabilities to database security. Together with problems that have yet to be fully solved, the field of database security promises to remain an important area of future research.

ACKNOWLEDGMENTS

I wish to acknowledge the many discussions that I have had on the AMAC security technique and on the conceptual modeling of sensitive information with Kamal Karlapalem, Stefan Vieweg, and Werner Winiwarter. In particular, I wish to thank A Min Tjoa and Dieter Merkl for their many fruitful comments.

References

Batini, C., Ceri, S., and Navathe, S. B. (1992). “Conceptual Database Design: An Entity-Relationship Approach.” Benjamin/Cummings, Reading, Massachusetts.

Bell, D. E. and LaPadula, L. J. (1976). “Secure Computer System: Unified Exposition and Multics Interpretation.” Technical Report MTR-2997. MITRE Corp., Bedford, Massachusetts.

Biba, K. J. (1977). “Integrity Considerations for Secure Computer Systems.” ESD-TR-76-372, USAF Electronic Systems Division.

Biskup, J. (1990). “A General Framework for Database Security.” Proc. European Symp. Research in Computer Security (ESORICS ’90), Toulouse, France.

Biskup, J. and Brüggemann, H. H. (1988). The Personal Model of Data: Towards a Privacy-Oriented Information System. Computers & Security 7, North-Holland (Elsevier).

Biskup, J. and Brüggemann, H. H. (1989). The Personal Model of Data: Towards a Privacy-Oriented Information System (extended abstract). Proc. 5th Int’l Conf. on Data Engineering (ICDE ’89). IEEE Computer Society Press.

Biskup, J. and Brüggemann, H. H. (1991). Das datenschutzorientierte Informationssystem DORIS: Stand der Entwicklung und Ausblick. Proc. 2. GI-Fachtagung “Verläßliche Informationssysteme” (VIS ’91). IFB 271, Springer-Verlag.

Burns, R. K. (1992). A Conceptual Model for Multilevel Database Design. Proc. 5th Rome Laboratory Database Workshop, Oct. 1992.

Chen, P. P. (1976). The Entity Relationship Model: Towards a Unified View of Data. ACM Trans. Database Systems (TODS) 1(1).

Clark, D. D. and Wilson, D. R. (1987). A Comparison of Commercial and Military Computer Security Policies. Proc. 1987 Symp. Research in Security and Privacy. IEEE Computer Society Press.

Codd, E. F. (1970). A relational model for large shared data banks. Comm. ACM 13(6).

Cuppens, F. and Yazdanian, K. (1992). A “Natural” Decomposition of Multi-level Relations. Proc. 1992 Symp. Research in Security and Privacy. IEEE Computer Society Press.

Denning, D. E. (1988). Database Security. Ann. Rev. Comput. Sci. 3.

Denning, D. E., Lunt, T. F., Schell, R. R., Heckman, M., and Shockley, W. R. (1987). A multilevel relational data model. Proc. 1987 Symp. Research in Security and Privacy. IEEE Computer Society Press.

Denning, D. E., Lunt, T. F., Schell, R. R., Shockley, W. R., and Heckman, M. (1988). The SeaView Security Model. Proc. 1988 Symp. Research in Security and Privacy. IEEE Computer Society Press.

Elmasri, R. and Navathe, S. B. (1989). “Fundamentals of Database Systems.” Benjamin/Cummings, Reading, Massachusetts.

Fernandez, E. B., Summers, R. C., and Wood, C. (1981). “Database Security and Integrity.” (System Programming Series) Addison-Wesley, Reading, Massachusetts.

Fernandez, E. B., Gudes, E., and Song, H. (1989). A Security Model for Object-Oriented Databases. Proc. 1989 Symp. Research in Security and Privacy. IEEE Computer Society Press.

Fernandez, E. B., Gudes, E., and Song, H. (1993). A Model for Evaluation and Administration of Security in Object-Oriented Databases. IEEE Trans. Knowledge and Data Engineering (forthcoming).

Fugini, M. G. (1988). Secure Database Development Methodologies. In “Database Security: Status and Prospects,” C. Landwehr, ed. North-Holland (Elsevier).

Garvey, C. and Wu, A. (1988). ASD-Views. Proc. 1988 Symp. Research in Security and Privacy. IEEE Computer Society Press.

Graham, G. S. and Denning, P. J. (1972). Protection Principles and Practices. Proc. AFIPS Spring Joint Computer Conference.

Griffiths, P. P. and Wade, B. W. (1976). An authorization mechanism for a relational database system. ACM Trans. Database Systems (TODS) 1(3).

Haigh, J. T., O’Brien, R. C., Stachour, P. D., and Toups, D. L. (1990). The LDV Approach to Database Security. In “Database Security III: Status and Prospects,” D. L. Spooner and C. Landwehr, eds. North-Holland (Elsevier).

Harrison, M. A., Ruzzo, W. L., and Ullman, J. D. (1976). Protection in operating systems. Comm. ACM 19(8).

Hinke, T. H., Garvey, C., and Wu, A. (1992). A1 Secure DBMS Architecture. In “Research Directions in Database Security,” T. F. Lunt, ed. Springer-Verlag.

ITSEC (1991). Information Technology Security Evaluation Criteria (ITSEC). Provisional Harmonized Criteria, COM(90) 314. Commission of the European Communities.

Jajodia, S. and Kogan, B. (1990). Integrating an Object-Oriented Data Model with Multilevel Security. Proc. 1990 Symp. Research in Security and Privacy. IEEE Computer Society Press.

Jajodia, S. and Sandhu, R. (1990a). Database Security: Current Status and Key Issues. ACM SIGMOD Record 19(4).

Jajodia, S. and Sandhu, R. (1990b). Polyinstantiation Integrity in Multilevel Relations. Proc. 1990 Symp. Research in Security and Privacy. IEEE Computer Society Press.

Jajodia, S., Sandhu, R., and Sibley, E. (1990). Update Semantics of Multilevel Secure Relations. Proc. 6th Ann. Comp. Security Applications Conf. (ACSAC ’90). IEEE Computer Society Press.

Jajodia, S. and Sandhu, R. (1991a). Toward a multilevel secure relational data model. Proc. ACM SIGMOD Conf. Denver, Colorado.

Jajodia, S. and Sandhu, R. (1991b). A Novel Decomposition of Multilevel Relations into Single-Level Relations. Proc. 1991 Symp. Research in Security and Privacy. IEEE Computer Society Press.

Jajodia, S. and Mukkamala, R. (1991). Effects of the SeaView decomposition of multilevel relations on database performance. Proc. 5th IFIP WG 11.3 Conf. Database Security. Shepherdstown, West Virginia.

Kang, I. E. and Keefe, T. F. (1992a). On Transaction Processing for Multilevel Secure Replicated Databases. Proc. European Symp. Research in Computer Security (ESORICS ’92). LNCS 648, Springer-Verlag.

Kang, I. E. and Keefe, T. F. (1992b). Recovery Management for Multilevel Secure Database Systems. Proc. 6th IFIP WG 11.3 Conf. on Database Security. Vancouver, British Columbia.

Keefe, T. F., Tsai, W. T., and Thuraisingham, M. B. (1989). Soda: A secure Object-Oriented Database System. Computers & Security 8(5). North-Holland (Elsevier).

Kogan, B. and Jajodia, S. (1990). Concurrency Control in Multilevel Secure Databases using the Replicated Architecture. Proc. ACM SIGMOD Conf. Portland, Oregon.

Lampson, B. W. (1971). Protection. Proc. 5th Princeton Conf. Information and Systems Sciences.

Lampson, B. W. (1973). A Note on the Confinement Problem. Comm. ACM 16(10).

Landwehr, C. E. (1981). Formal Models of Computer Security. ACM Comp. Surveys 13(3).

Lunt, T. F., Schell, R. R., Shockley, W. R., and Warren, D. (1988). Toward a multilevel relational data language. Proc. 4th Ann. Comp. Security Applications Conf. (ACSAC ’88). IEEE Computer Society Press.

Lunt, T. F., Denning, D. E., Schell, R. R., Heckman, M., and Shockley, W. R. (1990). The SeaView Security Model. IEEE Trans. Software Engineering (ToSE) 16(6).

Lunt, T. F. and Fernandez, E. B. (1990). Database Security. ACM SIGMOD Record 19(4).

McDermid, J. A. (1993). Formal Methods: Use and Relevance for the Development of Safety-critical Systems. In “Safety Aspects of Computer Control,” P. Bennett, ed. Butterworth-Heinemann.

Merkl, D. and Pernul, G. (1994). Security for Next Generation of Hypertext Systems. Hypermedia 6(1) (forthcoming). Taylor Graham.

Millen, J. K. (1989). Models of Multilevel Computer Security. Advances in Computers 29 (M. C. Yovits, ed.). Academic Press.

Millen, J. K. and Lunt, T. F. (1992). Security for Object-Oriented Database Systems. Proc. 1992 Symp. Research in Security and Privacy. IEEE Computer Society Press.

Morgenstern, M. (1987). Security and Inference in Multilevel Database and Knowledge-based Systems. Proc. ACM SIGMOD Conf. San Francisco, California.

Navathe, S. B. and Pernul, G. (1992). Conceptual and Logical Design of Relational Databases. Advances in Computers 35 (M. C. Yovits, ed.). Academic Press.

Neumann, P. G. (1990). Rainbow and Arrows: How the Security Criteria Address Computer Misuse. Proc. 13th National Computer Security Conference. IEEE Computer Society Press.

Neumann, P. G. (1992). Trusted Systems. In “Computer Security Reference Book,” K. M. Jackson and J. Hruska, eds. Butterworth-Heinemann.

Pernul, G. and Tjoa, A. M. (1991). A View Integration Approach for the Design of Multilevel Secure Databases. Proc. 10th Int’l Conf. Entity-Relationship Approach (ER ’91). San Mateo, California.

Pernul, G. and Luef, G. (1991). A Multilevel Secure Relational Data Model Based on Views. Proc. 7th Ann. Comp. Security Applications Conf. (ACSAC ’91). IEEE Computer Society Press.

Pernul, G. (1992a). Security Constraint Processing in Multilevel Secure AMAC Schemata. Proc. European Symp. Research in Computer Security (ESORICS ’92). LNCS 648, Springer-Verlag.

Pernul, G. (1992b). Security Constraint Processing During MLS Database Design. Proc. 8th Ann. Comp. Security Applications Conf. (ACSAC ’92). IEEE Computer Society Press.

Pernul, G. and Luef, G. (1992). A Bibliography on Database Security. ACM SIGMOD Record 21(1).

Pernul, G. and Tjoa, A. M. (1992). Security Policies for Databases. Proc. IFAC Symp. Safety and Security of Computer Systems (SAFECOMP ’92). Pergamon Press.

Pernul, G., Winiwarter, W., and Tjoa, A. M. (1993). The Entity-Relationship Model for Multilevel Security. Institut für Angewandte Informatik und Informationssysteme. Universität Wien.

Rabitti, F., Bertino, E., Kim, W., and Woelk, D. (1991). A Model of Authorization for Next-generation Database Systems. ACM Trans. Database Systems (TODS) 16(1).

Rochlis, J. A. and Eichin, M. W. (1989). With Microscope and Tweezers: The Worm from MIT’s Perspective. Comm. ACM 32(6).

Schell, R. R., Tao, T. F., and Heckman, M. (1985). Designing the Gemsos Security Kernel for Security and Performance. Proc. 8th Nat’l. Computer Security Conference. IEEE Computer Society Press.

Sell, P. J. (1992). The SPEAR Data Design Method. Proc. 6th IFIP WG 11.3 Conf. Database Security. Burnaby, British Columbia.

Smith, G. W. (1990a). The Semantic Data Model for Security: Representing the Security Semantics of an Application. Proc. 6th Int’l Conf. Data Engineering (ICDE ’90). IEEE Computer Society Press.

Smith, G. W. (1990b). Modeling Security Relevant Data Semantics. Proc. 1990 Symp. Research in Security and Privacy. IEEE Computer Society Press.

Smith, K. and Winslett, M. (1992). Entity Modeling in the MLS Relational Model. Proc. 18th Conf. Very Large Databases (VLDB ’92).

Stachour, P. D. and Thuraisingham, B. (1990). Design of LDV: A Multilevel Secure Relational Database Management System. IEEE Trans. KDE 2(2).

Stoll, C. (1988). Stalking the Wily Hacker. Comm. ACM 31(5).

Stonebraker, M. and Rubinstein, P. (1976). The Ingres Protection System. Proc. 1976 ACM Annual Conference.

TCSEC (1985). Trusted Computer System Evaluation Criteria (Orange Book). National Computer Security Center, DOD 5200.28-STD.

TDI (1990). Trusted Database Interpretation of the Trusted Computer System Evaluation Criteria. NCSC-TG-021. Version 1.

Thompson, K. (1984). Reflections on Trusting Trust. Comm. ACM 27(8). (Also in ACM Turing Award Lectures: The First Twenty Years 1965-1985. ACM Press.)

Thomsen, D. J. and Haigh, J. T. (1990). A Comparison of Type Enforcement and Unix Setuid Implementation of Well-Formed Transactions. Proc. 6th Ann. Comp. Security Applications Conf. (ACSAC ’90). IEEE Computer Society Press.

Thuraisingham, M. B. (1990). Towards the design of a secure data/knowledge base management system. Data & Knowledge Engineering 5(1), North-Holland (Elsevier).

Thuraisingham, M. B. (1991). Multilevel Security for Multimedia Database Systems. In “Database Security: Status and Prospects IV,” S. Jajodia and C. E. Landwehr, eds. North-Holland (Elsevier).

Thuraisingham, M. B. (1992). Multilevel Secure Object-Oriented Data Model: Issues on noncomposite objects, composite objects, and versioning. JOOP, SIGS Publications.

Wilson, J. (1988). A Security Policy for an A1 DBMS (a Trusted Subject). Proc. 1988 Symp. Research in Security and Privacy. IEEE Computer Society Press.

Wiseman, S. (1991). Abstract and Concrete Models for Secure Database Applications. Proc. 5th IFIP WG 11.3 Conf. Database Security. Shepherdstown, West Virginia.