Preventing Illicit Information Flow in Networked
Computer Games using Security Annotations
Jonas Rabbe, s991125
Kgs. Lyngb...
Abstract
In networked computer games using a client-server structure, bugs that re-
sult in information exposure can be us...
Resumé
Bugs that result in information exposure can, in client-server based
computer games that communicate over a ...
Preface
This report describes and documents the M.Sc. Thesis Project of Jonas Rabbe.
The project corresponds to 30 ECTS po...
Contents
1 Introduction 17
1.1 Problem Specification . . . . . . . . . . . . . . . . . . . . . 18
1.1.1 Final Specification ...
4.2 Type System for Security Annotations. . . . . . . . . . 46
4.2.1 The Block Label. . . . . . . . . . . . . . . . . . . ...
D Test Results 105
D.1 Type System . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
D.2 TMCA . . . . . . . . . . ...
List of Tables
2.1 The simple rule for verifying declassification. . . . . . . 27
2.2 A more complex declassification type r...
List of Figures
3.1 Syntax of expression in the gWhile language. . . . . . . 32
3.2 Syntax of statements in the gWhile lan...
5.15 Formats for the functions that alter γ . . . . . . . . . . . 68
5.16 Data types for fields in the TMCA tuple. . . . . ...
CHAPTER 1
Introduction
More and more computer programs communicate over an insecure network
such as the Internet. These pr...
with a domain specific to the game. This language can be designed with the
annotations in place to allow the program to be ...
Security Annotated Code
This title seemed in my mind quite focussed on the realization aspect of
the initial problem speci...
The playing field for Battleships consists of a grid where each grid inter-
section can be addressed by a unique coordinate...
CHAPTER 2
The Decentralized Label Model
This chapter describes the Decentralized Label Model which is one example
of secur...
specify how their values may flow. The value of a variable may flow into
another variable if the flow constitutes a restricti...
then
owners(L) = {A, C}
Similarly, readers(L, O) returns a set containing the principals that are
readers for a given owne...
Definition of L1 = L2
owners(L1) = owners(L2)
∀O ∈ owners(L1), readers(L1, O) = readers(L2, O)
2.1.1 Restriction as a Parti...
Anti-symmetry is again shown by looking at the conditions of the re-
striction operator. For the first condition this is
ow...
It is clear from the proof of the binary join operator above and the
associated traits that S finds the least upper bound o...
2.2 Declassification
As mentioned above, the Decentralized Label Model allows data to flow
as long as it becomes more restri...
Since the current process is the process of player 1, ρ has the value A
and the type rule for the declassification is verifi...
2.3 The Principal Hierarchy
Above in Section 2.2 the hierarchy of principals was briefly touched upon
through the notion of...
run in. Once the program has been compiled into machine code there is no
room for annotations. There are two problems asso...
CHAPTER 3
The gWhile Language
To allow annotations of the example program, Battleships, as described in
the previous chapt...
initial values to simplify the initializations compared with JIF.
In the definition of the language some notations are used...
are shown above–mainly in Figure 3.2–but are discussed in Section 3.2.
L ∈ Label
L ::= A : | A : R1, . . . , Rn | A : all ...
3.2 Cryptography and Communication
In this section I will explain the cryptography and communication primitives
in the gWh...
with the symmetric cryptography. This authority is not readily replicable
in the realm of asymmetric cryptography since th...
e ∈ Expr
e ::= {e1, . . . , en}{ak}
S ∈ Stmt
S ::= send(e1, . . . , en) | receive(n1, . . . , nj; xj+1, . . . , xn)
| decr...
Cryptography in Communication
• Advantages
– Everything sent is either encrypted or signed, so there is no need
to think a...
S ∈ Stmt
S ::= ssreceive(e1, . . . , ej; xj+1, . . . , xk){sk}
Figure 3.7: Simple symmetric receive statement
Since the ca...
CHAPTER 4
Type System and Analysis
With some familiarity with both the Decentralized Label Model as a model
for using secu...
τ ∈ Basic Type
τ ::= int | bool | principal | int × int → int
| τ1 × . . . × τn | τ1 × . . . × τn → crypt
| τ1 × . . . × τ...
name, for example γ = [x → v ] ∨ [x → v ], then it results in an error. In
the type system this is modeled by
dom([x → v ]...
their specific type. In the case of variables, (var), this is done by fetching
the type from γ. For (table) the type is fou...
cern cryptography and communication, and have a number of points worth
investigating. The asymmetric cryptographic send an...
(asend)
γ e1 : τ1 . . . γ en : τn
γ(k+
) = τ1 × . . . × τn → encrypt
γ asend(e1, . . . , en){k+
} : stm
(arec)
γ e1 : τ1 ....
(inum) γ x{L} := n : [x → int]
(iprinc) γ x{L} := ’A’ : [x → principal]
(itrue) γ x{L} := true : [x → bool]
(ifalse) γ x{L...
Table 4.8 show the rules for the key declarations. A key declaration,
(decl), declares a key format which is used by eithe...
The rules for statements, processes, and the system use the large types
defined for the plain system to indicate that they ...
4.2.2 Expressions
All expressions carry the label map, λ, and the set of current principals,
ρ. The label map is used for ...
Following the restriction operator it is not possible to let another principal
read data unless he is already in the effect...
table, t, which everyone can read where the contents are known, a variable
l with a low security level, and a variable h w...
(ssendL)
ρ; λ e1 : L1 . . . ρ; λ en : Ln
λ(k) = Lk1 × . . . × Lkn → Lk
(∀i ∈ [1, n])(B Li Lki)
B; ρ; λ ssend(e1, . . . , e...
for ssend, (ssendL), and likewise for areceive and ssreceive, with the
rules (arecL) and (ssrecL). Most of the type rule f...
(inumL) λ x{L} := n : [x → L]
(iprincL) λ x{L} := ‘A’ : [x → L]
(itrueL) λ x{L} := true : [x → L]
(ifalseL) λ x{L} := fals...
(kdL) declare d as {T1{L1}, . . . , Tn{Ln}}{L} :
[d → L1 × . . . × Ln → L]
(kdcombL)
KD1 : λ KD2 : λ
KD1; KD2 : λ ∨ λ
Tabl...
After this information has been collected, for each communication state-
ment, it is matched to verify that everything tha...
for process A and
while b do
ssreceive(x; b){k}
endwhile;
ssreceive(x; b){k}
for process B, then the loop in process B may...
CHAPTER 5
Implementation
To test the design of the language, type systems, and simple analysis, an
implementation of each ...
directly implement the abstract syntax using Lex and Yacc.
There have been other changes, however, that have been made to ...
type label = (string ∗ string list) list;
(∗ owner: reader1, reader2, ...; owner2: ... ∗)
Figure 5.1: Datatype for labels
...
datatype stmt =
ASS of expr ∗ expr
| SKIP
| SEQ of stmt ∗ stmt
| IF of expr ∗ stmt ∗ stmt
| WH of expr ∗ stmt
| ASEND of e...
symmetric receive and act for statement, the final string of that statement
is the principal which the current process want...
the format of the associated keys and is simply list with fields of the format
string ∗ label
The system contains a list of...

Preventing Illicit Information Flow in Networked Computer Games Using Security Annotations

Published on: Mar 4, 2016
Source: www.slideshare.net


Transcripts - Preventing Illicit Information Flow in Networked Computer Games Using Security Annotations

  • 1. Preventing Illicit Information Flow in Networked Computer Games using Security Annotations Jonas Rabbe, s991125 Kgs. Lyngby, March 2005 M.Sc. Thesis IMM-THESIS-2005-11 IMM Computer Science and Engineering Informatics and Mathematical Modeling Technical University of Denmark
  • 2. 2
  • 3. Abstract
    In networked computer games using a client-server structure, bugs that result in information exposure can be used to cheat.
    A programming language allowing the specification of security annotations can be designed specifically for the domain of a given game. Using the classic game Battleships as an example, a language, gWhile, has been designed which allows annotations following the Decentralized Label Model. The gWhile language includes communication and cryptography for secure communications, as well as other primitives specific to Battleships.
    A type system has been designed to verify the information flow of programs in gWhile with respect to the Decentralized Label Model. A simple analysis has also been designed, the Type Matching Communications Analysis, which attempts to match communication statements in a program.
    Keywords: security, language design, information flow controls, the Decentralized Label Model, declassification, complete lattice, type system.
  • 4. 4
  • 5. Resumé
    Bugs that result in information exposure can, in client-server based computer games that communicate over a network, be used to cheat.
    A programming language that makes it possible to specify security annotations can be designed specifically for the domain of a given game. Based on the classic game Battleships, the language gWhile has been developed. gWhile permits security annotations that follow the Decentralized Label Model. The language includes communication primitives and cryptography to offer secure communication, as well as other primitives specific to Battleships.
    A type system that can verify the information flow of a gWhile program with respect to the Decentralized Label Model has been designed. A simple analysis has also been developed: the Type Matching Communications Analysis attempts to match communication primitives in a program.
    Keywords: security, language design, information flow controls, the Decentralized Label Model, declassification, complete lattice, type system.
  • 6. 6
  • 7. Preface
    This report describes and documents the M.Sc. Thesis Project of Jonas Rabbe. The project corresponds to 30 ECTS points and has been carried out in the period from October 2004 to March 2005 at the Technical University of Denmark, Department of Informatics and Mathematical Modeling, Computer Science and Engineering division, under the supervision of Professor Flemming Nielson.
    I would like to thank Flemming Nielson for his great interest in my work, and his constructive criticism and inspiration during the project period. I would also like to thank my parents for proof-reading and commenting on the structure of the report. And thanks to the whole crew of room 008 for good company.
    Special thanks to my wife, Susan Rabbe, for support, understanding, and patience throughout this project, as well as proof-reading.
    Kgs. Lyngby, March 1, 2005
    Jonas Rabbe, s991125
  • 8. 8
  • 9. Contents
    1 Introduction (p. 17)
      1.1 Problem Specification (p. 18)
        1.1.1 Final Specification (p. 19)
      1.2 Battleships (p. 19)
      1.3 Structure of Report (p. 20)
    2 The Decentralized Label Model (p. 21)
      2.1 Operators in the Decentralized Label Model (p. 22)
        2.1.1 Restriction as a Partial Order (p. 24)
        2.1.2 A Lattice of Labels (p. 25)
      2.2 Declassification (p. 27)
      2.3 The Principal Hierarchy (p. 29)
      2.4 Implementations of the Decentralized Label Model (p. 29)
    3 The gWhile Language (p. 31)
      3.1 Syntax of the gWhile Language (p. 31)
      3.2 Cryptography and Communication (p. 34)
        3.2.1 Primitives for Asymmetric Cryptography (p. 35)
        3.2.2 Primitives for Symmetric Cryptography (p. 37)
    4 Type System and Analysis (p. 39)
      4.1 Plain Type System (p. 39)
        4.1.1 Expressions (p. 41)
        4.1.2 Statements (p. 42)
        4.1.3 Initialization, Keys, Processes, Key Declarations, and System (p. 43)
  • 10.
      4.2 Type System for Security Annotations (p. 46)
        4.2.1 The Block Label (p. 47)
        4.2.2 Expressions (p. 48)
        4.2.3 Statements (p. 49)
        4.2.4 Initialization, Keys, Processes, Key Declarations, and System (p. 52)
      4.3 Type Matching Communications Analysis (p. 54)
        4.3.1 Matching Rules (p. 55)
    5 Implementation (p. 57)
      5.1 Parsing the gWhile Language (p. 57)
      5.2 The gWhile Parse Tree (p. 58)
      5.3 Type System Implementation (p. 62)
        5.3.1 Data Types (p. 62)
        5.3.2 Label Implementation (p. 63)
        5.3.3 Type Checking (p. 65)
        5.3.4 Error Handling (p. 69)
      5.4 Type Matching Communications Analysis (p. 69)
      5.5 Test (p. 71)
      5.6 Battleships (p. 72)
        5.6.1 Verification of Battleships (p. 78)
        5.6.2 Introducing Leaks (p. 80)
    6 Discussion (p. 83)
      6.1 Results (p. 83)
      6.2 Future Work (p. 84)
      6.3 Conclusion (p. 87)
    A Initial Problem Specification (p. 89)
    B User Guide (p. 91)
      B.1 Installation Instructions (p. 91)
      B.2 Verifying gWhile Programs (p. 91)
    C Source Code (p. 93)
      C.1 Parser, Typechecker, and Analysis (p. 93)
        C.1.1 List of Files (p. 93)
      C.2 server-based-battleships.w (p. 94)
      C.3 test.w (p. 101)
  • 11.
    D Test Results (p. 105)
      D.1 Type System (p. 105)
      D.2 TMCA (p. 108)
    Bibliography (p. 111)
  • 12. 12
  • 13. List of Tables
    2.1 The simple rule for verifying declassification (p. 27)
    2.2 A more complex declassification type rule (p. 28)
    4.1 Plain type rules for the basis elements of expressions (p. 41)
    4.2 Plain type rules for the composite elements of expressions (p. 41)
    4.3 Plain type rules for simple statements (p. 42)
    4.4 Plain type rules for cryptographic and communicative statements (p. 44)
    4.5 Plain type rules for the initialization (p. 45)
    4.6 Plain type rules for the Asymmetric Keys (p. 45)
    4.7 Plain type rules for the Processes (p. 45)
    4.8 Plain type rules for the Key Declarations (p. 45)
    4.9 Plain type rules for the System (p. 46)
    4.10 Annotation type rules for expressions (p. 48)
    4.11 Annotation type rules for basic statements (p. 49)
    4.12 Annotation type rules for asymmetric cryptographic and communication statements (p. 50)
    4.13 Annotation type rules for symmetric cryptographic and communication statements (p. 51)
    4.14 Annotation type rules for the initialization (p. 53)
    4.15 Annotation type rules for the Asymmetric Keys (p. 53)
    4.16 Annotation type rule for the processes (p. 53)
    4.17 Annotation type rule for the key declarations (p. 54)
    4.18 Annotation type rule for the system (p. 54)
    D.1 Test cases and results for type checker (p. 108)
    D.2 Test cases and results for TMCA (p. 110)
  • 14. 14
  • 15. List of Figures
    3.1 Syntax of expression in the gWhile language (p. 32)
    3.2 Syntax of statements in the gWhile language (p. 32)
    3.3 Syntax of remaining elements of the gWhile language (p. 33)
    3.4 Syntax of expression based encryption (p. 36)
    3.5 Syntax of cryptography in communication (p. 36)
    3.6 Symmetric cryptography primitives (p. 37)
    3.7 Simple symmetric receive statement (p. 38)
    3.8 Symmetric key initialization (p. 38)
    3.9 Symmetric key instantiation (p. 38)
    3.10 Symmetric key declaration (p. 38)
    5.1 Datatype for labels (p. 59)
    5.2 Datatype for expressions (p. 59)
    5.3 Datatype for statements (p. 60)
    5.4 Datatype for the initialization (p. 60)
    5.5 Datatype for the asymmetric keys (p. 61)
    5.6 Datatype for processes, key declarations, and system (p. 61)
    5.7 The basic type datatype (p. 62)
    5.8 Using the fold functionality in the restriction operator (p. 65)
    5.9 Format for the type function to check expressions (p. 65)
    5.10 Format for the type function to check expressions (p. 66)
    5.11 Format for the function which gets the type and label of an assignee (p. 66)
    5.12 Using unzip and map to get the types and labels as separate lists (p. 66)
    5.13 Format for the function which checks that labels are legal (p. 67)
    5.14 Checking the restriction condition of communication statements (p. 67)
  • 16.
    5.15 Formats for the functions that alter γ (p. 68)
    5.16 Data types for fields in the TMCA tuple (p. 70)
    5.17 Key declarations for the symmetric keys used in communication between A and the server (p. 73)
    5.18 Initializations for process A (p. 74)
    5.19 Targeting a pair of coordinates (p. 75)
    5.20 Initializations for the server (p. 77)
    5.21 Using random to select the starting player (p. 77)
    5.22 Indicating turn of player with a boolean value (p. 78)
    5.23 Acting for player 2 to declassify board value at the target coordinates (p. 78)
    5.24 Simple communication statement which is affected by implicit information flow control (p. 78)
    5.25 An example of illegal implicit flow in the server (p. 79)
    6.1 Syntax of do not act for statement (p. 86)
  • 17. CHAPTER 1 Introduction
    More and more computer programs communicate over an insecure network such as the Internet. These programs rely on the safe and secure communication of their data to function properly. This is for example the case with computer games, where knowledge of data from the other players can be used to cheat.
    In the computer games industry cheating is serious business. If it comes out that cheating takes place in a game, players will quickly abandon it in favor of games where cheating is not yet prevalent. Cheating can be likened to doping in sports. As tournaments are held with large prizes, this comparison becomes more and more apt. Cheating in a tournament could potentially cost a player the prize.
    Most computer games today use a client-server infrastructure, where one server runs games for a number of players. Each player has some secret information. In the case of a first person shooter this could be his location, health, ammunition, or other vital signs. In a game such as Battleships, the secret information is simply the location of your ships on the playing field.
    In a computer game, one category of cheats concerns the leakage of secret information, for example knowing the location of your enemy’s ships. This kind of information exposure is sometimes due to bugs in a game [Pri00]. The leakage of information can be prevented by annotating programs with information about how data may flow.
    One specific set of annotations is the Decentralized Label Model, which has a well-defined set of rules for how data flow is restricted. The Decentralized Label Model was introduced by Myers and Liskov [ML97], and has been applied to the programming language JIF [Mye99].
    For a specific computer game a programming language can be designed
  • 18. with a domain specific to the game. This language can be designed with the annotations in place to allow the program to be annotated and the annotations to be verified.
    I have designed such a language for the game Battleships which allows annotations following the Decentralized Label Model. For this language I have designed a type system, both to check the plain types of a program and to check that the annotations hold.
    To facilitate the client-server architecture, the language was designed with communication statements, along with extensions to the Decentralized Label Model to accommodate the new statements. Some aspects of these statements can be checked with the type system. However, to verify that a program does not contain a communication statement which will never be matched, a simple static analysis has been designed.
    Both the type system and the analysis have been implemented and used to verify that the problems which could occur in the Battleships program, as mentioned above, were discovered.
    1.1 Problem Specification
    This project was started with the initial problem specification shown in Appendix A. However, it became clear that the focus of the work in this thesis, while still centered on security annotations and their use in networked computer programs, would differ from the initial objectives.
    The specifics of communication attacks in particular would serve as a distraction from the focus of the project, namely the use of security annotations to restrict illegal information flow. Concentrating on the design aspects of the language and verification, as mentioned below, removes the need to discuss communication attacks beyond the most general description. Suffice it to say, an implementation of the source code translation would have to take this into consideration to ensure the secrecy of the communications.
    The design of the programming language, and especially the introduction of the communication primitives and their associated type rules, with respect to both the plain type system and security annotations, quickly proved more time consuming than anticipated. The source code translation envisaged in the original specification, an equally large undertaking, was phased out to make sure the language and type system design received the necessary attention.
    The cryptographic extensions to the language were intended for the specific nature of networked computer games, but more generally link into networked programs, and are applicable to all programs employing the type of annotations used in this thesis.
    In the same fashion as the problem specification, the title of the project evolved as work progressed. The initial title was: Preventing Cheating in Computer Games through Realization of
  • 19. Security Annotated Code
    This title seemed in my mind quite focussed on the realization aspect of the initial problem specification. Similarly, information leaks in games can be used to cheat as described above, but the title made this work seem like a panacea in the field of computer games. The final title is more focussed on the containment of information leaks where they are not allowed, putting more focus on the networked aspects of the work in this project.
    1.1.1 Final Specification
    Computer programs of today rely on data being protected, essentially in a strongbox, to prevent illicit information flow. Once a principal can read data, however, there are no restrictions on his distribution of it. This is especially the case with data communicated over an open medium.
    In computer games, cheats are available that rely on gaining access to information and distributing it to the cheater, for example by exploiting a bug in the game to read the secret information of the other players.
    A programming language can be designed, with the program or game in mind, in which the legal flow of information can be specified in the source code. These annotations can be verified, using type systems and program analysis, to inspect the legality of the information flow.
    The language must contain cryptography and communication statements appropriate for the program it is designed for, to ensure the secrecy of the communications and prevent information leaks. Other primitives specific to the game are also introduced.
    To motivate the development and provide a domain for the programming language I study an example program, namely the game Battleships. It is an example of a game played by two players over a network. Each player hides information from the other player: the location of his ships on the battlefield. It is also a game in which a player will gain a large advantage by learning the hidden information of the other player.
    An extension of the While language, with parallelism, communication, and various security mechanisms like access control annotations and cryptographic primitives, is studied.
    1.2 Battleships
    The game Battleships is used as an example program in this thesis. It is a game known to most from its days as a board game. Battleships is played by two players who start by assigning each of their ships to a position on the playing field, while keeping these positions secret from their opponent. Once the ships have been placed, the players take turns trying to shoot down the ships of their opponent.
  • 20. The playing field for Battleships consists of a grid where each grid intersection can be addressed by a unique coordinate, normally in the form of a letter and a number. The player whose turn it is announces a coordinate he is targeting; his opponent then announces whether it was a splash (there was no ship occupying that grid intersection) or a kaboom (there was a ship at the target coordinates).
    There are a number of different variations on the rules at this point. If the shot resulted in a hit, the player who shot either gets to go again, or the two players change turns. If the player misses he always loses the turn. The player who first hits all the ships of his opponent wins.
    In most versions of the game there are a number of different ships having different sizes; for example the carrier could be five sections long, while the destroyer could be three sections in size.
    The main element of the game is secrecy. Each player tries to hide his ships from the opponent, but at the same time he must be truthful if his opponent hits one of his ships.
    1.3 Structure of Report
    Chapter 2 discusses a specific set of security annotations, namely the Decentralized Label Model. This set of annotations is used in the programming language which is designed, and its use and properties must be explained.
    Chapter 3 presents the language which is designed. It describes both the syntax and the specific thoughts behind it, especially behind the cryptography and communication primitives.
    Chapter 4 introduces the type systems used for verifying programs in the programming language. It also shows a simple analysis which attempts to match communications between processes.
    Chapter 5 describes the specifics of the implementation. In the course of this project a parser, a type system, and an analysis have been implemented.
    Chapter 6 summarizes the results of the work in this thesis, and shows a number of future directions in which the results could be used.
CHAPTER 2

The Decentralized Label Model

This chapter describes the Decentralized Label Model, which is one example of security annotations in practice. First the concept of labels, the central concept in the Decentralized Label Model, is described. This is followed by a description of the operators that work on labels in Section 2.1. Special properties of the operators, specifically the restriction operator as a partial order and the lattice property of the set of labels, are also described there. Section 2.2 covers declassification, the notion that values can flow against restrictions. Authority is discussed in Section 2.3. Finally, JIF is very briefly described in Section 2.4, along with some thoughts on implementations of the Decentralized Label Model.

The Decentralized Label Model is a model for specifying end-to-end confidentiality policies. The authors of the Decentralized Label Model, Andrew C. Myers and Barbara Liskov, saw a number of inadequacies in the way access control currently works. The most common form of access control is based on access control lists which specify who can read and write data. This works well for preventing illicit access to data, but once data has been read, there are no limitations on how it can spread. There exist some models that allow end-to-end security policies, but according to Myers and Liskov [ML97, ML00] these cannot readily be put into practice. The Decentralized Label Model is an attempt at making an access control model which contains end-to-end security policies while enabling its use in practical systems.

Myers has developed a language [Mye99] based on Java and the Java Virtual Machine which enables the use of the Decentralized Label Model. The language was introduced as JFlow, but has later been renamed JIF, which is an abbreviation of Java Information Flow.

The Decentralized Label Model is based on the labeling of variables to
specify how their values may flow. The value of a variable may flow into another variable if the flow constitutes a restriction as described below.

A label is a set of owners and for each owner a set of readers. The syntax for a label has been defined as

{O1 : R1; . . . ; On : Rn}

where each Ri is a comma-separated list of readers. Both the O and R of owners and readers refer to principals of the program. In the literature the principals are normally named with one letter. For two principals talking to each other the letters A and B are used. If a server is involved it usually has the name S.

For a label, owner set means the set of all owners of the label, while reader set refers to the set of all readers for a given owner, for example the reader set of a label for the owner A. Another concept is the effective reader set of a label, which is the set of readers that all the owners agree may read the data. The effective reader set is the intersection of all the reader sets of the label.

An owner should only occur once in the owner set of a label. Likewise a reader should be unique in the reader set for a given owner. For a label {A : B; B : A; A : C} this would mean that the second instance of A would be ignored and the label would be equal to {A : B; B : A}. Normally labels will be short enough that it is not a difficult task to ensure that an owner does not occur more than once.

2.1 Operators in the Decentralized Label Model

Having introduced labels and the basic concepts related to labels, the mechanisms for working on them can be introduced. In the Decentralized Label Model variables are annotated to specify their labels, but to work with the labels specified some basic operators must be defined.

In the model there are two operators that work on labels: the restriction operator, ⊑, and the join operator, ⊔. The restriction operator is a relation on labels which is true if the first label is at most as restrictive as the second.
The phrase “the second label is a restriction on the first” is also used. The join operator is used to combine two labels.

The definitions of the restriction and join operators depend on two functions, owners(L) and readers(L, O), which are analogous to the owner set of the label and the reader set for the owner O of the label. The function owners(L) returns a set containing the principals that are owners of the label L. If

L = {A : A, B, C; C : C, B}
then

owners(L) = {A, C}

Similarly, readers(L, O) returns a set containing the principals that are readers for a given owner, O, in L. Using the same L as above:

readers(L, A) = {A, B, C}

An owner is implicitly a reader in his own reader set. For a label {O : R}, where R is a comma-separated list of readers, this means that readers({O : R}, O) is {R} ∪ {O}. In the example with L above this is not seen since A is already a member of his reader set, but if L is {A : B, C} the result is still the same:

readers(L, A) = {B, C} ∪ {A} = {A, B, C}

Another case worth examining is the result of readers(L, B) where B ∉ owners(L). Since B is not an owner of the label, he does not impose any restriction on the propagation of data, and his reader set is simply the set of all principals.

The two operators are defined in [ML97] as:

Definition of L1 ⊑ L2
owners(L1) ⊆ owners(L2)
∀O ∈ owners(L1), readers(L1, O) ⊇ readers(L2, O)

where L1 is less restrictive than L2, or L2 is a restriction on L1. Notice that the condition on readers is the inverse of the condition for owners. And

Definition of L1 ⊔ L2
owners(L1 ⊔ L2) = owners(L1) ∪ owners(L2)
readers(L1 ⊔ L2, O) = readers(L1, O) ∩ readers(L2, O)

The join results in the least restrictive label which is at least as restrictive as both L1 and L2. Since the reader set for an owner which is not in a label is simply all the principals, the join of two labels where one contains an owner not in the other results in that owner and his readers being carried over unchanged into the resulting label. For example:

{A : B; B : A, C} ⊔ {A : B} = {A : B; B : A, C}

This also means that the join of a label L and the empty label is just L, since the empty label imposes no restrictions.

In addition to the two operators described above, the notion of two labels being equal is also needed. The equality relation on two labels is defined as:
Definition of L1 = L2
owners(L1) = owners(L2)
∀O ∈ owners(L1), readers(L1, O) = readers(L2, O)

2.1.1 Restriction as a Partial Order

It is interesting to note that the restriction operator, ⊑, on the set of all labels, SL, is a partial order. This is proven by the following three properties given the labels L1, L2, and L3:

1. L1 ⊑ L1, reflexivity
2. L1 ⊑ L2 ∧ L2 ⊑ L3 ⇒ L1 ⊑ L3, transitivity
3. L1 ⊑ L2 ∧ L2 ⊑ L1 ⇒ L1 = L2, anti-symmetry

Reflexivity holds immediately since a set is always a subset of itself:

owners(L1) ⊆ owners(L1)

Similarly, for the readers

∀O ∈ owners(L1), readers(L1, O) ⊇ readers(L1, O)

Transitivity follows in a similar fashion, from the transitive property of the ⊆ operator on sets.

owners(L1) ⊆ owners(L2) ∧ owners(L2) ⊆ owners(L3) ⇒ owners(L1) ⊆ owners(L3)

In the same fashion as for the reflexivity property, the second condition of the restriction operator can be shown:

∀O ∈ owners(L1), readers(L1, O) ⊇ readers(L2, O)
∧ ∀O ∈ owners(L2), readers(L2, O) ⊇ readers(L3, O)
⇒ ∀O ∈ owners(L1), readers(L1, O) ⊇ readers(L3, O)

This follows from the transitivity of the ⊇ operator on sets together with condition one: since owners(L1) ⊆ owners(L2), every owner of L1 is also covered by the second quantifier.
Anti-symmetry is again shown by looking at the conditions of the restriction operator. For the first condition this is

owners(L1) ⊆ owners(L2) ∧ owners(L2) ⊆ owners(L1) ⇒ owners(L1) = owners(L2)

which holds immediately. Similarly for the second condition:

∀O ∈ owners(L1), readers(L1, O) ⊇ readers(L2, O)
∧ ∀O ∈ owners(L2), readers(L2, O) ⊇ readers(L1, O)
⇒ ∀O ∈ owners(L1), readers(L1, O) = readers(L2, O)

2.1.2 A Lattice of Labels

Next it is shown that the partial order admits binary least upper bounds and that they are given by the formula for ⊔.

It is clear that Lx ⊔ Ly finds an upper bound of the set {Lx, Ly}, since the operator adds owners and removes readers when there is a coincidence of owners. Recall the condition for the restriction operator, which is our partial order: more owners, and fewer readers for the owners already present in the less restrictive label, constitute a restriction. The join operator yields a label which is at least as restrictive as each of its operands, and is therefore an upper bound. Of interest here, however, is the least upper bound, or in label terminology: the least restrictive of all labels that are at least as restrictive as the operands. This is proven by contradiction.

Imagine that the label found by Lx ⊔ Ly is not the least upper bound of the set {Lx, Ly}. That would mean that there has to exist a label, L, which is an upper bound for {Lx, Ly}, that is, more restrictive than both, while being less restrictive than Lx ⊔ Ly. L would have to contain all the owners from Lx as well as all the owners from Ly in order to fulfill condition one of the restriction operator. Similarly, for each owner in L there could be no more readers than for that owner in Lx or Ly. Since Lx ⊔ Ly is found by the union of the owners, and for each owner the intersection of the readers, the label L is equivalent to Lx ⊔ Ly, and there cannot exist a label which is more restrictive than Lx and Ly but less restrictive than Lx ⊔ Ly.
The binary join operator thus finds the least upper bound of its operands. In [DP90] it is shown that there are a number of properties associated with a binary join operator, most notably commutativity and associativity. This means that ⨆S can be used to denote the join of all the elements of the finite and non-empty set S. If S is the set {Lx, Ly, Lz} then

⨆S = Lx ⊔ Ly ⊔ Lz
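The binary join and its extension to a finite set of labels can be sketched in Python. This is only an illustration under stated assumptions: a label is modelled as a dictionary from owner to reader set, the universe PRINCIPALS and all function names are hypothetical, and the fold starts from the empty label so that an empty collection is also handled.

```python
from functools import reduce

# Assumed universe of principals for this sketch.
PRINCIPALS = frozenset({"A", "B", "C"})

BOTTOM = {}  # the empty label: no owners, hence no restrictions


def readers(label, owner):
    """Reader set of `owner` in `label`; a non-owner imposes no restriction."""
    if owner not in label:
        return PRINCIPALS
    return frozenset(label[owner]) | {owner}  # an owner is implicitly a reader


def join(l1, l2):
    """Binary join: union of the owners, intersection of the reader sets."""
    return {o: readers(l1, o) & readers(l2, o) for o in set(l1) | set(l2)}


def join_all(labels):
    """Join of a finite collection of labels, folded from the empty label."""
    return reduce(join, labels, BOTTOM)
```

Because the fold relies only on commutativity and associativity of the binary join, the order in which the labels of the collection are supplied should not matter.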
It is clear from the proof for the binary join operator above and the associated properties that ⨆S finds the least upper bound of S.

However, the special case ⨆∅ remains. The join of the empty set simply yields the empty label. Remember that the empty label is the least restrictive of all labels since it allows information to flow anywhere. The empty label is the least or bottom element of the lattice, and is denoted by the ⊥ symbol. Hence the join of a finite (and possibly empty) set S = {L1, . . . , Ln} (for n ≥ 0) is given by

⨆S = ⊥ ⊔ L1 ⊔ . . . ⊔ Ln

A given program has a finite set of principals and hence a finite set SL of labels. The join of the whole set, ⨆SL, is therefore covered by the above and in fact yields the top element, the label which contains all principals of the program as owners and no specified readers for each owner. Data with this label is owned by everyone and can flow nowhere. It is the most restrictive label and has the symbol ⊤.

At this point it has been shown that the finite set SL is a partially ordered set with the restriction ordering, ⊑, and that every (possibly empty) subset, S⊆, of SL has a least upper bound, ⨆S⊆. Lemma A.2 of [NNH99, page 392] says:

Lemma A.2 For a partially ordered set SL = (SL, ⊑) the claims
1. SL is a complete lattice,
2. every subset of SL has a least upper bound, and
3. every subset of SL has a greatest lower bound
are equivalent.

Therefore SL is a complete lattice.

Some lattice properties of the set of labels have been described previously by Myers and Liskov, through reference to the work of Denning and the notion of a security-class lattice [Den76]. The concept of a complete lattice, however, has not been applied to the set of labels in the available literature.

One of the conditions on labels which should be noted again in this connection is the prohibition of redundancy in labels.
Redundancy is, for example, the repetition of owners in a label or of readers for a given owner, and would allow for labels that do not satisfy the anti-symmetry condition of the partial order. In the definition Myers and Liskov give of labels there is no prohibition on redundancy [ML97].
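The role of redundancy can be illustrated with a small Python sketch. It is a sketch under assumptions rather than the implementation of this thesis: a parsed label is taken to be a list of (owner, readers) pairs, normalize drops any repeated owner as described in the introduction to labels, and PRINCIPALS, TOP, BOTTOM and the function names are hypothetical.

```python
# Assumed finite set of principals of the program.
PRINCIPALS = frozenset({"A", "B", "C"})


def normalize(components):
    """Turn a parsed label, a list of (owner, readers) pairs, into a dict.

    A repeated owner is ignored after its first occurrence, and each owner
    is added to his own reader set since he reads implicitly.
    """
    label = {}
    for owner, reader_list in components:
        if owner not in label:
            label[owner] = frozenset(reader_list) | {owner}
    return label


def readers(label, owner):
    """Reader set for `owner`; a non-owner allows all principals."""
    return label.get(owner, PRINCIPALS)


def restricts(l1, l2):
    """True when l1 ⊑ l2: no more owners, and no fewer readers per owner."""
    return set(l1) <= set(l2) and all(
        readers(l1, o) >= readers(l2, o) for o in l1
    )


def effective_readers(label):
    """Intersection of the reader sets of all owners of the label."""
    result = PRINCIPALS
    for owner in label:
        result &= readers(label, owner)
    return result


BOTTOM = {}                                      # the empty label
TOP = {p: frozenset({p}) for p in PRINCIPALS}    # every principal an owner
```

After normalization two labels are equal exactly when their dictionaries are equal, so anti-symmetry can be checked by plain comparison; without the normalization step, {A : B; B : A; A : C} and {A : B; B : A} would restrict each other while being written differently.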
2.2 Declassification

As mentioned above, the Decentralized Label Model allows data to flow as long as it becomes more restrictive. This works well for restricting the flow of data and preventing outsiders from learning your data. Sometimes, however, you may need to let an outsider learn something about your data. In keeping with the example program, Battleships, you have to let your opponent know if there is a ship at the coordinates he targeted. If you are only allowed restrictive flow you cannot do this unless he is allowed to learn all the values of your board, something which is quite undesirable. What is missing is declassification [ZM01].

Declassification allows an owner to modify the flow policy for his data. In the Decentralized Label Model this modification can either be the addition of one or more readers for the owner, or the removal of the owner and his readers.

Data is, in the gWhile language as defined in Section 3.1, declassified using the declassify construct. This function takes an expression and a label and attempts to put the label on the data of the expression. In Table 4.10 of Section 4.2 the type rule used for verifying the declassification expression is shown. In its simplest form, as shown in Table 2.1, a label, LA, is constructed from the current principal, ρ in the rule. This label is used together with the label, L, which the data should have after the declassify expression. The rule simply checks that the label of the expression, Le, is less restrictive than L joined with LA.

ρ; λ ⊢ e : Le     LA = {ρ : ∅}     Le ⊑ L ⊔ LA
----------------------------------------------
ρ; λ ⊢ declassify(e, L) : L

Table 2.1: The simple rule for verifying declassification

To follow the example above, imagine that a player, player 1, receives a pair of coordinates from an opponent, player 2, in a game of Battleships. The principal representing player 1 is denoted by A, while player 2 is represented by the principal B.
The board of player 1 has the label {A : ∅} and is therefore very restricted; only player 1 may read it. Player 1 has to send a reply to player 2 stating whether there is a ship at the given coordinates or not. The way to do that is to declassify the value returned from accessing the board to allow the opponent to read it. This could be done by relabeling it either to {A : B} or simply to {}. Since there is only one opponent in a Battleships game there is no danger in allowing him to do what he wants with the data, and the board value is simply relabeled to the empty label using the following statement:

boardValue := declassify(board[x][y], {})
Since the current process is the process of player 1, ρ has the value A and the type rule for the declassification is verified as follows:

Le ⊑ L ⊔ LA
⇔ {A : ∅} ⊑ {} ⊔ {A : ∅}
⇔ {A : ∅} ⊑ {A : ∅}

In the gWhile language, however, it is not always as simple as using the current principal. Our data may reside on a server which under certain circumstances has the authority to act for us. If, for example, our game board is on a server which controls the logic of the game, the server must, upon receiving the target coordinates from our opponent, be allowed to declassify the board value for the coordinates. Since the current principal of the server is S, it cannot declassify using the simple rule above. A more complex rule is called for, which is shown in Table 2.2.

ρ; λ ⊢ e : Le     LA = {A : ∅ | A ∈ ρ}     Le ⊑ L ⊔ LA
------------------------------------------------------
ρ; λ ⊢ declassify(e, L) : L

Table 2.2: A more complex declassification type rule

This is the same rule which is shown in Table 4.10 of Section 4.2. Here ρ does not simply contain the current principal, but is a set of principals the current process can act for, which can be augmented using the special actfor construct described in Section 3.2.2. Since a process can always act for itself, ρ will for the server always contain S. If the server can currently act for us, ρ is extended to contain both S and A, and the label LA becomes {A : ∅; S : ∅}, since each principal in ρ is included as an owner with no readers in the label. The declassification example shown above then has the type verification:

Le ⊑ L ⊔ LA
⇔ {A : ∅} ⊑ {} ⊔ {A : ∅; S : ∅}
⇔ {A : ∅} ⊑ {A : ∅; S : ∅}

Outside the blocks where the server can act for us, ρ simply contains S and the declassification cannot take place:

Le ⊑ L ⊔ LA
⇔ {A : ∅} ⊑ {} ⊔ {S : ∅}
⇔ {A : ∅} ⊑ {S : ∅}

since {A : ∅} ⊑ {S : ∅} does not hold.
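The side condition of the declassification rule can be traced with a short Python sketch. The dictionary representation of labels, the universe PRINCIPALS and the name can_declassify are assumptions made for this illustration only; the type rule actually used in the thesis is the one in Table 4.10.

```python
# Assumed principals: two players and the server.
PRINCIPALS = frozenset({"A", "B", "S"})


def readers(label, owner):
    """Reader set of `owner`; a non-owner imposes no restriction."""
    if owner not in label:
        return PRINCIPALS
    return frozenset(label[owner]) | {owner}  # an owner reads implicitly


def restricts(l1, l2):
    """True when l1 ⊑ l2."""
    return set(l1) <= set(l2) and all(
        readers(l1, o) >= readers(l2, o) for o in l1
    )


def join(l1, l2):
    """Union of the owners, intersection of the reader sets."""
    return {o: readers(l1, o) & readers(l2, o) for o in set(l1) | set(l2)}


def can_declassify(expr_label, target_label, rho):
    """Check Le ⊑ L ⊔ LA, where LA = {A : ∅ | A ∈ ρ}."""
    authority = {p: frozenset() for p in rho}
    return restricts(expr_label, join(target_label, authority))
```

With the board label {A : ∅} and the empty target label, the check succeeds for ρ = {A} and for ρ = {S, A}, and fails for ρ = {S}, matching the three derivations above.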
2.3 The Principal Hierarchy

Above, in Section 2.2, the hierarchy of principals was briefly touched upon through the notion of principals acting for each other. In the JIF language there is a distinction between the static principal hierarchy and run-time principals [Mye99, ML00]. The static hierarchy is used in the static analysis of programs, while principals and labels can also be used as values at run-time, which complicates matters further. The use of labels and principals at run-time means that there are aspects of the principal hierarchy that may change during the execution of a program, including which principals a principal may act for. The only mention I could find of how this dynamic hierarchy is maintained is in [ML97, page 5], where it says: “The right of one principal to act for another is recorded in a database.”

The gWhile language, outlined in Section 3.1, is only verified statically. The loss in versatility is outweighed by the added simplicity in the verification. Of further interest, however, is that the authority of a principal is not propagated through a convoluted hierarchy, but is entirely derived during the verification of a program.

One of the cornerstones of the gWhile language is the notion of communication. As concluded in Section 3.2, the communication will be combined with cryptography to allow for secure communication. Both asymmetric and symmetric cryptography will be used, in similar but distinct communication primitives. Asymmetric cryptography does not say anything about the person who encrypted and sent a message, since the aptly named public key is publicly available. In symmetric cryptography, on the other hand, part of the security of the communication lies in the fact that only the two parties that communicate know the key. This has another profound impact: if you are told the key, it says something about your authority.
In the gWhile language I have let the ability to decrypt a message sent using a symmetric key be synonymous with the authority to act for the owners of the key. The owners of the key are the owners of the label for the package that is sent. In most cases this label will only have one owner and let everyone read it; there are, in other words, normally no limits to how an encrypted package may flow.

2.4 Implementations of the Decentralized Label Model

The only implementation at present of the Decentralized Label Model, with regard to a programming language, is JIF and the run-time system Myers has built for it.

The Decentralized Label Model is limited, as are all source code security verifications, by the quality of the runnable code, and the environment it is
run in. Once the program has been compiled into machine code there is no room for annotations. There are two problems associated with this. One is the fact that a game can be altered by patches to perform illegal flow in an uncontrolled environment. The second is the situation where an end-user uses a program which breaks its promise to handle his information properly.

The second case has been the focus of most of the literature involving the Decentralized Label Model. Lately, however, the use of the Decentralized Label Model to prevent accidental information leaks has increased [ML00] compared to the earlier discussions [ML97, ML98]. In this case, the JIF run-time implementation provides assurance that the information flow policies of the program are enforced, making sure that the information of the user is not leaked without his knowledge. Myers has chosen to construct his own environment in which to run the compiled code. The JIF run-time system is based on the Java Virtual Machine (JVM) and ensures that the information flow control of the Decentralized Label Model is observed.

Another way to ensure the enforcement of source code security verifications is through a reference monitor [SMH01]. In this case, however, the reference monitor must know either the original source code and how the source code would make the program behave, or a verification result of the program saying what the program should do. If the program behaves outside the normal parameters as described by the source code or verification result, the monitor can restrict or terminate it.

The problem with both of these approaches is that they require the user to obtain a module, the run-time system or reference monitor, which they have to trust.

The first case is probably more applicable to the example of a computer game. Players trust the manufacturer enough to believe they will not intentionally allow people to cheat. The problem is more the imagined inaptitude of the developer.
Most often the problem is bugs which allow malicious players to hack, patch, or otherwise intercept data between the server and their own version of the game and cheat that way. While a program running on an untrusted platform, which is what the game is on the malicious player’s machine, cannot be trusted, the verification of the code that runs on the server prevents information leaks that are not consciously put in.

A further problem with regard to computer games is the issue of performance. In computer games there is an ever-present quest for getting the best graphics and fastest code on existing hardware. Using interpreted byte-code as the JIF run-time does, or a reference monitor which verifies each piece of object code before it is run, does not mesh well with this pursuit.
CHAPTER 3

The gWhile Language

To allow annotations of the example program, Battleships, as described in the previous chapter on the Decentralized Label Model, a language had to be designed. This chapter describes the design of the language and the thought behind the communication. The name of the language, gWhile, is short for game While, from its use in a game and its inheritance from the While language.

First the syntax of the language is shown, along with a brief explanation of the elements. Since secure communication is a cornerstone of a networked version of Battleships, Section 3.2 is devoted to the discussion of the cryptography and communication statements of the language.

3.1 Syntax of the gWhile Language

The syntax of the language designed for this thesis is shown in Figure 3.1, Figure 3.2, and Figure 3.3. This language is called the gWhile language, and is based on the While language introduced in [NN92]. There are, however, a number of changes in the gWhile language to make it more suitable for the example program, and to incorporate the annotations of the Decentralized Label Model.

The JIF implementation of the Decentralized Label Model uses the notion of labeling of variables. In JIF this is done using a syntax in which a variable is specifically initialized with a type, value, and label.

i: int{A: A, B} := 0

The labeling of variables is inspired by the syntax of JIF, as is evident in the initializations of Figure 3.3. Types are in gWhile inferred from the
initial values to simplify the initializations compared with JIF.

In the definition of the language some notations are used for variables, numbers, and principals, which are defined as

x, k, k+, k−, d ∈ Var
n ∈ Num
A ∈ Princ

where Var are variables, Num are numeric values, and Princ are principals. Similar notations for expressions, statements, and so forth are defined for their relevant components below.

e ∈ Expr
e ::= n | x | this | 'A' | x[e1][e2] | true | false
    | random(e) | declassify(e, L)
    | e1 + e2 | e1 = e2 | e1 < e2 | not e

Figure 3.1: Syntax of expressions in the gWhile language

Expressions, as shown in Figure 3.1, have seen the addition of the this keyword which refers to the principal of the current process, a notation for principals, the table or two dimensional array used for the playing field of each player, and two functions: random and declassify. random is a weak random number generator used in the generation of the playing field for each player; it returns a number between 1 and the numerical value of the expression it is called with. declassify is used in the Decentralized Label Model as discussed in Section 2.2.

A more fundamental change is the combination of arithmetic and boolean expressions into the expression type. This was done to achieve greater freedom in the handling of expressions, for instance in the communication. A consequence of this is the necessity to introduce a type system for the basic types of programs in gWhile, shown in Section 4.1.

S ∈ Stmt
S ::= x := e | x[e1][e2] := e3 | skip | S1; S2
    | if e then S1 else S2 endif
    | while e do S endwhile
    | asend(e1, . . . , en){k+}
    | areceive(e1, . . . , ej; xj+1, . . . , xn){k−}
    | ssend(e1, . . . , ek){k}
    | sreceive(e1, . . . , ej; xj+1, . . . , xk){k} and actfor A in S endactfor
    | ssreceive(e1, . . . , ej; xj+1, . . . , xk){k}
    | instantiate k

Figure 3.2: Syntax of statements in the gWhile language

Additionally, the language also includes communication and cryptography primitives for the processes to communicate and do so securely. These
are shown above, mainly in Figure 3.2, but are discussed in Section 3.2.

L ∈ Label
L ::= A : | A : R1, . . . , Rn | A : all | | L; L

i ∈ Init
i ::= x{L} := n | x{L} := 'A' | x{L} := true | x{L} := false
    | x[n1][n2]{L} | key k{L} using d | i, i

AK ∈ Asymmetric Keys
AK ::= k(d)+ | k(d)− | AK, AK

P ∈ Proc
P ::= A[AK] : (i){S} | P P

KD ∈ Key Declarations
KD ::= declare d as {T1{L}, . . . , Tn{L}}{L} | KD; KD

Sys ∈ System
Sys ::= [KD]P

Figure 3.3: Syntax of remaining elements of the gWhile language

The gWhile language has no dynamic memory allocation and all variables for each process must be declared in the initialization. The types of the variables are also determined from this initialization and follow the variables all through the program. Worth noticing is the initialization of table variables. A table can only contain integer values and is initialized with the size of the table. In the initialization all the cells of the table are automatically set to zero.

One of the initializations is the initialization of symmetric keys with the key k using d construct. The key in question is defined, but not instantiated; this means that using the key without instantiating it first would result in an error. Symmetric keys are initialized from a key declaration. A key declaration is a signature for the fields that can be sent using keys associated with it; for each field the type and label must be specified. Using, declaring, and instantiating symmetric keys is also discussed in Section 3.2.

Asymmetric keys are declared from the beginning of each process. The idea is that a client has the public keys of the server and uses these to communicate symmetric keys to the server, which are then used to communicate the data. In the same fashion as the symmetric keys, an asymmetric key is defined with respect to a key declaration. This is done in the header of the process using the k(d)+ syntax.
The example would declare a public key with the identifier k using the key declaration d. The name of the key would be k+.
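The life cycle of a symmetric key, defined in the initialization and only usable once instantiated, can be modelled with a small Python sketch. Everything here is hypothetical: the class and method names are invented for the illustration, labels on the fields are left out, only the int and bool field types are modelled, and the error is raised at run-time although the thesis does not state at this point whether the check is static or dynamic.

```python
from dataclasses import dataclass


def type_name(value):
    """Map a Python value to an assumed gWhile base type name."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "bool"
    if isinstance(value, int):
        return "int"
    raise TypeError(f"unsupported value {value!r}")


@dataclass(frozen=True)
class KeyDeclaration:
    """Signature of the fields that may be sent under associated keys."""
    name: str
    field_types: tuple  # for example ("int", "int", "bool")


@dataclass
class SymmetricKey:
    """A key as introduced by `key k using d`: defined, not yet instantiated."""
    name: str
    declaration: KeyDeclaration
    instantiated: bool = False

    def instantiate(self):
        """Model of the `instantiate k` statement."""
        self.instantiated = True

    def check_fields(self, values):
        """Reject use before instantiation and fields outside the signature."""
        if not self.instantiated:
            raise RuntimeError(f"key {self.name} used before instantiation")
        actual = tuple(type_name(v) for v in values)
        if actual != self.declaration.field_types:
            raise TypeError(f"fields {actual} do not match {self.declaration.name}")
```

A key created as SymmetricKey("k", KeyDeclaration("d", ("int", "int", "bool"))) would reject every use until instantiate() has been called, after which only three fields of the declared types are accepted.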
3.2 Cryptography and Communication

In this section I will explain the cryptography and communication primitives in the gWhile language, as well as the reasoning behind them.

In JIF, communication is performed using channels. [ML00] describes channels as half-variables; they have associated labels in the same fashion as variables, but only allow either input or output. The rules for reading from and writing to channels are the same as for reading from or writing to variables. Channels differ further from the communication primitives normally found in network programming in that a channel is not only a way for two computers to communicate, it is also a way for a computer to communicate with its display, attached printer, even the keyboard.

While channels may work well for simple input and output of data, another layer of abstraction is desirable. It may be desired to send a message with several values over an open network. Communication over an open network, however, has a further worry attached. Since the network is open, any data transmitted over it can be read by anyone. This opens the door for cryptography to ensure the secrecy of the communication.

In communication secured by cryptography a number of conditions must be in place to ward off attacks. Cryptographic protocols are usually validated with respect to three properties:

Authenticity That each principal of the communication can be sure the other principal is who he claims to be.

Confidentiality That information communicated cannot be read by a third party.

Integrity That data cannot be altered in the process of the communication.

These properties are requirements for an implementation allowing the execution of programs in gWhile, but will not be considered in the analysis and verification of gWhile programs.
It is assumed that an interpreter that implements the communication primitives would take the above properties into account to prevent attacks, allowing this discussion to focus on the primitives and their effects on the Decentralized Label Model.

The language contains both symmetric and asymmetric cryptography. The idea is that each copy of the game knows the public keys of the server; these keys are then used to communicate or negotiate one or more symmetric keys which can be used for the game specific communication.

Using symmetric cryptography alone is not feasible, since a unique set of keys for each player would be needed. While this is not a problem for the players, the number of keys that would have to be known beforehand by the server is immense. Relying solely on asymmetric cryptography, however, is not an option either. As described in Section 2.3 authority is connected
with the symmetric cryptography. This authority is not readily replicable in the realm of asymmetric cryptography since the public keys of the server are available to all players of the game. The mixture of asymmetric and symmetric cryptography allows us to initiate the communication through the asymmetric cryptography, and use the symmetric cryptography to instill the notion of authority.

In the design of the language I have considered two different approaches. The first was expression based encryption while the second was encryption built into the communication. Though the discussion in Section 3.2.1 is centered around asymmetric cryptography, much of the thought behind it is also applicable to symmetric cryptography as discussed in Section 3.2.2. The communication statements are assumed to be synchronous, which means that both the sender and the receiver halt until the communication has taken place.

3.2.1 Primitives for Asymmetric Cryptography

In the discussion on asymmetric cryptography ak is used to denote an asymmetric key. This could be either a public or a private key, which have been shown as k+ and k− previously.

Expression Based Cryptography has two separate parts. The encryption takes a number of expressions and encrypts them into an encrypted package, as shown in Figure 3.4. The resulting package can be passed around in the same fashion as other expressions, provided it is typed correctly. An encrypted package can be decrypted using the decrypt statement. In the decryption pattern matching can take place. Pattern matching means that the encrypted package is decrypted, but its values are only assigned if its contents match the pattern of the decrypt statement. A pattern consists of a number of expressions to match, then a semicolon, followed by a number of variables to write the remaining values into. The first values are compared to the results of the expressions; the pattern matches only if these values are equal.
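The pattern matching step can be sketched in Python as a function over an already decrypted package. The name match_pattern and its signature are assumptions made for this illustration; the cryptography itself is left out, and treating a package of the wrong length as a failed match is likewise an assumption.

```python
def match_pattern(package, expected, n_vars):
    """Match a decrypted package against a pattern.

    `package`  is the tuple of decrypted values,
    `expected` holds the results of the expressions before the semicolon,
    `n_vars`   is the number of variables after the semicolon.

    Returns the values to assign to the variables, or None when the
    package does not match, in which case nothing is assigned.
    """
    j = len(expected)
    if len(package) != j + n_vars:
        return None  # assumed: a wrong number of fields never matches
    if tuple(package[:j]) != tuple(expected):
        return None  # a leading value differs from its expression
    return tuple(package[j:])
```

For a pattern with the two expressions 1 and 7 followed by two variables, the package (1, 7, 3, 4) would bind the variables to 3 and 4, while (2, 7, 3, 4) would be rejected.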
In both approaches, as shown in Figure 3.4 and Figure 3.5, the key, ak, can be either a public or a private key. Using a public key ensures confidentiality, as the package is encrypted; the use of a private key ensures authenticity, as the package is signed.

Since an encrypted package can be sent around, any of the expressions encrypted in the expression based encryption can be another encrypted package. This means that parts, or all, of a package can be signed while still ensuring confidentiality: a package can be both signed and encrypted.

Cryptography in Communication considers cryptography built into the communication primitives of the language; these are shown in Figure 3.5.
e ∈ Expr
e ::= {e1, ..., en}{ak}

S ∈ Stmt
S ::= send(e1, ..., en)
    | receive(n1, ..., nj; xj+1, ..., xn)
    | decrypt x as {n1, ..., nj; xj+1, ..., xn}{ak}

Figure 3.4: Syntax of expression based encryption

This means that all communication is encrypted and allows for matching of encrypted values in the receive statement itself. A receive statement will only accept a package if it can be decrypted and matches the specified pattern.

S ∈ Stmt
S ::= asend(e1, ..., en){ak}
    | areceive(e1, ..., ej; xj+1, ..., xn){ak}

Figure 3.5: Syntax of cryptography in communication

Both approaches have a number of advantages and disadvantages.

Expression Based Cryptography
• Advantages
  – No more of the message than what is strictly necessary has to be encrypted.
  – Since not everything that is sent has to be encrypted, there is less of a burden on the processor.
  – Messages can be both signed and encrypted to ensure both authenticity and confidentiality.
• Disadvantages
  – It must be considered which parts of the message must be encrypted and which need not be.
  – Compared to cryptography built into the communication, an extra variable has to be used to hold the received package before it can be decrypted.
  – The type system for expressions would need an additional type for encrypted packages.
Cryptography in Communication
• Advantages
  – Everything sent is either encrypted or signed, so there is no need to think about which parts to encrypt.
  – All communication using the public key of the recipient is confidential.
• Disadvantages
  – Since everything that is sent is also encrypted, it can be quite processing intensive.
  – A message cannot be both encrypted and signed.

Since asymmetric cryptography is only used in the communication of symmetric keys to the server, the simplicity of the second approach made it an easy choice. This is also evident from its inclusion in the gWhile syntax shown in Section 3.1. The syntax included in the gWhile language does not provide signed messages, but only allows a public key to be used in the asend statement, and a private key in the areceive statement. Choosing this approach leads to an augmentation of the type system, shown in Table 4.4.

3.2.2 Primitives for Symmetric Cryptography

In the following the name sk is used to denote a symmetric key.

S ∈ Stmt
S ::= ssend(e1, ..., ek){sk}
    | sreceive(e1, ..., ej; xj+1, ..., xk){sk} andactfor A in S endactfor

Figure 3.6: Symmetric cryptography primitives

As mentioned above, the thought process for deciding the approach to asymmetric cryptography was largely relevant for symmetric cryptography as well. The simplicity of cryptography built into the communication primitives weighs more heavily than its drawbacks.

An additional consideration for symmetric cryptography, however, is that the ability to decrypt something which has been encrypted with a symmetric key says something about your authority with respect to the owner of the key. Combining the symmetric receive primitive with the notion of authority as described in Section 2.3 yields the primitives shown in Figure 3.6.
S ∈ Stmt
S ::= ssreceive(e1, ..., ej; xj+1, ..., xk){sk}

Figure 3.7: Simple symmetric receive statement

Since there are cases where a principal receiving a package from someone else does not want to act for that principal, a simple receive statement, shown in Figure 3.7, is also given.

i ∈ Init
i ::= key sk using d

Figure 3.8: Symmetric key initialization

S ∈ Stmt
S ::= instantiate sk

Figure 3.9: Symmetric key instantiation

KD ∈ Key Declarations
KD ::= declare d as {T1{L}, ..., Tn{L}}{L}
     | KD; KD

Figure 3.10: Symmetric key declaration

In these constructs the key, sk, is a symmetric key. A symmetric key is defined in the initialization of the process, shown in Figure 3.8, but is not instantiated until a session key is created using the instantiate statement shown in Figure 3.9. If instantiate is called on a symmetric key which has already been instantiated, a new instance of the key is generated. This is useful to prevent some replay attacks, since each message could then be sent with a new key. Of course there may be some added problems with the distribution of the new key and the processing power used for generating it.

The symmetric key, sk, is initialized from the declared key format, d. Key formats are declared in the Key Declarations header of the program, using the declare block as shown in Figure 3.10. A key declaration is declared with a number of fields to be sent. For each field, the type of the field and its label are specified. The label for the encrypted package sent on the network must also be specified.

Given such a declaration, a symmetric key can be thought of as a transformation function from a set of fields, each with a label, to a single encrypted field with a specific label, and vice versa.
CHAPTER 4
Type System and Analysis

With some familiarity with both the Decentralized Label Model as a model for security annotations of a language, and the syntax of the gWhile language, which allows annotations to be specified at the source level, the next step is to look at verification of the model and the language.

As mentioned in Section 3.1, the combination of arithmetic and boolean expressions into the Expr type meant that programs had to be typed for basic type conformance. Furthermore, the verification of the security annotations can also be performed by a type system [VSI96, VS97, ML97].

To verify both the types in programs and the security annotations, two type systems have been designed. Section 4.1 describes the so-called plain type system, which checks the basic types of expressions. The unique features of the gWhile language with respect to the Decentralized Label Model, however, are discussed in the annotation type system of Section 4.2.

A simple analysis of the communication is shown and described in Section 4.3.

4.1 Plain Type System

Two type systems have been designed: one for checking the basic types of a program, the other for checking the program with regard to the Decentralized Label Model. This section discusses the plain type system used for checking the basic types. The type system for the security annotations of the Decentralized Label Model is described in Section 4.2.

In the type system there is the notion of types. The basic types are given by
τ ∈ Basic Type
τ ::= int | bool | principal | int × int → int
    | τ1 × ... × τn
    | τ1 × ... × τn → crypt
    | τ1 × ... × τn → encrypt
    | τ1 × ... × τn → decrypt

while the large types are given by

T ∈ Large Type
T ::= stm | proc | sys

The basic types are used by expressions, variables, keys, and key declarations, while the large types are used by statements, processes, and the system.

The basic types are mostly self-explanatory, with the int × int → int type denoting the type for an integer table. The table can be thought of as a function which accepts two expressions that evaluate to integers and returns an integer. Also worth noting is the type for a key declaration, τ1 × ... × τn, where each type, τi, matches the ith type specified for the key format.

A symmetric key is associated with a key declaration; this means that it can only be used in sending and receiving messages that are in the format specified by the key declaration it is associated with. Symmetric keys have a type which is similar to the type for the integer table, except that they use the fields of the key declaration as arguments and return a crypt field.

An asymmetric key is also associated with a key declaration in much the same way as a symmetric key. The format is the same too, but using either the encrypt type for public keys, or the decrypt type for private keys.

The large types are returned by the type rules for statements, processes, and the system to indicate that the construct types.

Common to all the type rules is the function

γ : Var → τ

This function is the type environment, or variable map, for the type system. It maps each variable to its type, as defined by its initialization, and likewise for the key declarations and the asymmetric keys declared for the process.

The domain of the type environment, dom(γ), is {x | γ contains [x → ···]}. Furthermore, γ(x) = τ can be written if x ∈ dom(γ) and the occurrence of x in γ is [x → τ].
The three type environments from the key declarations, asymmetric keys, and initialization are combined using the combination, or ∨, operator. This operator creates a map which, for each input to the original maps, still yields the same value; for example, γ = [x → τ] ∨ [y → τ′] would yield γ(x) = τ and γ(y) = τ′. If two maps are combined which contain the same variable
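name, for example γ = [x → v] ∨ [x → v′], then it results in an error. In the type system this is modeled by the condition

dom([x → v]) ∩ dom([x → v′]) = ∅

which fails for this example. The intersection of the domains will only be non-empty, and result in an error, if a variable is defined multiple times. Using the intersection of the domains as an implicit condition on the combination operator, writing γ′ ∨ γ″ is sufficient for the combination γ′ ∨ γ″ where dom(γ′) ∩ dom(γ″) = ∅.

4.1.1 Expressions

\[
\begin{array}{ll}
\text{(int)} & \gamma \vdash n : \mathrm{int} \\[1ex]
\text{(var)} & \gamma \vdash x : \tau \quad \text{if } \gamma(x) = \tau \\[1ex]
\text{(this)} & \gamma \vdash \mathtt{this} : \mathrm{principal} \\[1ex]
\text{(princ)} & \gamma \vdash \text{'A'} : \mathrm{principal} \\[1ex]
\text{(true)} & \gamma \vdash \mathtt{true} : \mathrm{bool} \\[1ex]
\text{(false)} & \gamma \vdash \mathtt{false} : \mathrm{bool} \\[1ex]
\text{(table)} & \dfrac{\gamma \vdash e_1 : \tau_1 \quad \gamma \vdash e_2 : \tau_2 \quad \gamma(x) = \tau_1 \times \tau_2 \to \tau}{\gamma \vdash x[e_1][e_2] : \tau}
\end{array}
\]

Table 4.1: Plain type rules for the basis elements of expressions

\[
\begin{array}{ll}
\text{(rand)} & \dfrac{\gamma \vdash e : \mathrm{int}}{\gamma \vdash \mathtt{random}(e) : \mathrm{int}} \\[3ex]
\text{(decl)} & \dfrac{\gamma \vdash e : \tau}{\gamma \vdash \mathtt{declassify}(e, L) : \tau} \\[3ex]
\text{(eq)} & \dfrac{\gamma \vdash e_1 : \mathrm{int} \quad \gamma \vdash e_2 : \mathrm{int}}{\gamma \vdash e_1 = e_2 : \mathrm{bool}} \\[3ex]
\text{(lt)} & \dfrac{\gamma \vdash e_1 : \mathrm{int} \quad \gamma \vdash e_2 : \mathrm{int}}{\gamma \vdash e_1 < e_2 : \mathrm{bool}} \\[3ex]
\text{(add)} & \dfrac{\gamma \vdash e_1 : \mathrm{int} \quad \gamma \vdash e_2 : \mathrm{int}}{\gamma \vdash e_1 + e_2 : \mathrm{int}} \\[3ex]
\text{(not)} & \dfrac{\gamma \vdash e : \mathrm{bool}}{\gamma \vdash \mathtt{not}\ e : \mathrm{bool}}
\end{array}
\]

Table 4.2: Plain type rules for the composite elements of expressions

The type rules for expressions shown in Table 4.1 and Table 4.2 are quite straightforward. The basis elements of the language simply return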
their specific type. In the case of variables, (var), this is done by fetching the type from γ. For (table) the type is found in γ and compared to the type of each of the indexing expressions to return the final type. Although the type for a table is int × int → int, there is no mention of int in the type rule. This is because the table is initialized into γ using int × int → int, which allows us to simply check that the types are the same. The (decl) rule does nothing in the plain type system, as the declassify construct only has an effect in the annotation type system, and the type of the expression is simply passed on.

The constant rules (int), (this), (princ), (true), and (false) all simply return their appropriate types. The binary operator rules (eq), (lt), and (add) check that the operands have the appropriate types and return the type of the operator. The monadic operator, (not), and the remaining function, (rand), check the type of the operand and return the appropriate type.

4.1.2 Statements

For statements, only one simple type is used, the large type stm. The coherence in the statements must be checked, but only to ensure that they type.

Table 4.3 shows the type rules for the simple statements which, with the exception of the tabular assignment, are also present in the normal while language. The assignment rules (ass) and (tass) simply type the left hand side and the right hand side of the statement, and check that the types match. The (skip) rule always types, while the sequence rule, (seq), types each of the statements. Finally, the (if) and (while) rules check that the expression is a boolean expression and type their substatements.
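The expression rules of Table 4.1 and Table 4.2 can be read as a small recursive checker. The following Python sketch is only an illustration: the encoding of expressions as tagged tuples, the tag names, and the function name type_of are hypothetical and are not taken from the implementation described in this report.

```python
from typing import Dict, Tuple, Union

# Basic types: "int", "bool", "principal", or ("fun", argument types, result).
Type = Union[str, Tuple]
TABLE: Type = ("fun", ("int", "int"), "int")  # int × int → int


def type_of(gamma: Dict[str, Type], e: tuple) -> Type:
    """Return the basic type of expression e under the environment gamma."""
    tag = e[0]
    if tag == "int":                          # (int)
        return "int"
    if tag in ("true", "false"):              # (true), (false)
        return "bool"
    if tag in ("this", "princ"):              # (this), (princ)
        return "principal"
    if tag == "var":                          # (var): fetch the type from gamma
        return gamma[e[1]]
    if tag == "table":                        # (table): x[e1][e2]
        _, x, e1, e2 = e
        t = gamma[x]
        if not (isinstance(t, tuple) and t[0] == "fun"):
            raise TypeError(f"{x} is not a table")
        if t[1] != (type_of(gamma, e1), type_of(gamma, e2)):
            raise TypeError("index types do not match the table type")
        return t[2]
    if tag == "decl":                         # (decl): the type is passed on
        return type_of(gamma, e[1])
    if tag in ("rand", "not"):                # (rand), (not)
        want = "int" if tag == "rand" else "bool"
        if type_of(gamma, e[1]) != want:
            raise TypeError(f"{tag} expects an operand of type {want}")
        return want
    if tag in ("eq", "lt", "add"):            # (eq), (lt), (add)
        if type_of(gamma, e[1]) != "int" or type_of(gamma, e[2]) != "int":
            raise TypeError(f"{tag} expects int operands")
        return "int" if tag == "add" else "bool"
    raise TypeError(f"unknown expression tag {tag!r}")
```

In this sketch a rule that does not apply is reported by raising TypeError, which plays the role of an expression that does not type.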
\[
\begin{array}{ll}
\text{(ass)} & \dfrac{\gamma \vdash x : \tau \quad \gamma \vdash e : \tau}{\gamma \vdash x := e : \mathrm{stm}} \\[3ex]
\text{(tass)} & \dfrac{\gamma \vdash x[e_1][e_2] : \tau \quad \gamma \vdash e_3 : \tau}{\gamma \vdash x[e_1][e_2] := e_3 : \mathrm{stm}} \\[3ex]
\text{(skip)} & \gamma \vdash \mathtt{skip} : \mathrm{stm} \\[2ex]
\text{(seq)} & \dfrac{\gamma \vdash S_1 : \mathrm{stm} \quad \gamma \vdash S_2 : \mathrm{stm}}{\gamma \vdash S_1; S_2 : \mathrm{stm}} \\[3ex]
\text{(if)} & \dfrac{\gamma \vdash e : \mathrm{bool} \quad \gamma \vdash S_1 : \mathrm{stm} \quad \gamma \vdash S_2 : \mathrm{stm}}{\gamma \vdash \mathtt{if}\ e\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2\ \mathtt{endif} : \mathrm{stm}} \\[3ex]
\text{(while)} & \dfrac{\gamma \vdash e : \mathrm{bool} \quad \gamma \vdash S : \mathrm{stm}}{\gamma \vdash \mathtt{while}\ e\ \mathtt{do}\ S\ \mathtt{endwhile} : \mathrm{stm}}
\end{array}
\]

Table 4.3: Plain type rules for simple statements

Table 4.4, however, shows the type rules for the statements that concern cryptography and communication, and has a number of points worth investigating.

The rules (asend) and (arec) for the asymmetric cryptographic send and receive statements check that the key is an asymmetric key and compare the types of the arguments with the types specified by the key.

Symmetric cryptography and communication is typed in much the same way as asymmetric. There is the symmetric send, typed by (ssend), the simple symmetric receive, shown in (ssrec), and the symmetric receive and act for statement, (srec). The rules differ from the rules for asymmetric cryptography in the type of the key. Furthermore, the type rule for the receive and act for statement, (srec), in addition to the premises which are identical to those of (ssrec), verifies that A is a principal, and checks the statement, S.

Keys in the symmetric cryptographic communication must be instantiated from a key declaration, using the instantiate statement, before they can be used. This statement is checked by the (inst) rule, which simply verifies that the specified key is a symmetric key.

4.1.3 Initialization, Keys, Processes, Key Declarations, and System

For each initialization, (inum) through (itable) as shown in Table 4.5, the type rules create a map, mapping the variable name to the appropriate type. The only deviation is the type rule for the key initialization, (ikey), which looks up the key declaration in γ and uses it in the map for the key. The maps from each type rule are combined in (icomb) using the combination operator.

In the same fashion as the type rule for the key initialization in Table 4.5, the type rules for the asymmetric keys, (pubk) and (prik) in Table 4.6, look up the key declaration for the key and use it in the map for the key. The maps for the asymmetric keys are also combined, in (akcomb), with the ∨ operator.

The processes, shown in Table 4.7, have the large type proc in the same fashion as statements.
The type rule for a process, (proc), combines the map for the initializations with the map for the keys and the global key declarations to form a map for all the variables defined for the process; this map is used when checking the statement S. The γ received by (proc) is used to check both the asymmetric keys and the initialization; variables that are defined multiple times are caught in the combination of the environments. In reality the asymmetric keys and the variables from the initialization cannot interfere, since the asymmetric keys must be defined with either a + or a −, while normal variables cannot contain these characters.
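The combination operator with its implicit domain condition, as used by (proc), can be sketched as follows. This is a minimal illustration in Python: the function name combine and the representation of an environment as a dictionary are assumptions made for the example.

```python
from typing import Dict


def combine(*envs: Dict[str, object]) -> Dict[str, object]:
    """Combine environments; fail if any name is defined more than once."""
    result: Dict[str, object] = {}
    for env in envs:
        # The intersection of the domains must be empty.
        clash = result.keys() & env.keys()
        if clash:
            raise ValueError(f"defined multiple times: {sorted(clash)}")
        result.update(env)
    return result
```

For a process, the three arguments would be the maps from the key declarations, the asymmetric keys, and the initialization, in the spirit of γ ∨ γ′ ∨ γ″.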
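\[
\begin{array}{ll}
\text{(asend)} & \dfrac{\begin{array}{c}\gamma \vdash e_1 : \tau_1 \;\dots\; \gamma \vdash e_n : \tau_n \\ \gamma(k^+) = \tau_1 \times \dots \times \tau_n \to \mathrm{encrypt}\end{array}}{\gamma \vdash \mathtt{asend}(e_1, \dots, e_n)\{k^+\} : \mathrm{stm}} \\[5ex]
\text{(arec)} & \dfrac{\begin{array}{c}\gamma \vdash e_1 : \tau_1 \;\dots\; \gamma \vdash e_j : \tau_j \quad \gamma \vdash x_{j+1} : \tau_{j+1} \;\dots\; \gamma \vdash x_n : \tau_n \\ \gamma(k^-) = \tau_1 \times \dots \times \tau_n \to \mathrm{decrypt}\end{array}}{\gamma \vdash \mathtt{areceive}(e_1, \dots, e_j;\, x_{j+1}, \dots, x_n)\{k^-\} : \mathrm{stm}} \\[5ex]
\text{(ssend)} & \dfrac{\begin{array}{c}\gamma \vdash e_1 : \tau_1 \;\dots\; \gamma \vdash e_n : \tau_n \\ \gamma(k) = \tau_1 \times \dots \times \tau_n \to \mathrm{crypt}\end{array}}{\gamma \vdash \mathtt{ssend}(e_1, \dots, e_n)\{k\} : \mathrm{stm}} \\[5ex]
\text{(srec)} & \dfrac{\begin{array}{c}\gamma \vdash e_1 : \tau_1 \;\dots\; \gamma \vdash e_j : \tau_j \quad \gamma \vdash x_{j+1} : \tau_{j+1} \;\dots\; \gamma \vdash x_n : \tau_n \\ \gamma(k) = \tau_1 \times \dots \times \tau_n \to \mathrm{crypt} \quad \gamma \vdash A : \mathrm{principal} \quad \gamma \vdash S : \mathrm{stm}\end{array}}{\gamma \vdash \mathtt{sreceive}(e_1, \dots, e_j;\, x_{j+1}, \dots, x_n)\{k\}\ \mathtt{andactfor}\ A\ \mathtt{in}\ S\ \mathtt{endactfor} : \mathrm{stm}} \\[5ex]
\text{(ssrec)} & \dfrac{\begin{array}{c}\gamma \vdash e_1 : \tau_1 \;\dots\; \gamma \vdash e_j : \tau_j \quad \gamma \vdash x_{j+1} : \tau_{j+1} \;\dots\; \gamma \vdash x_n : \tau_n \\ \gamma(k) = \tau_1 \times \dots \times \tau_n \to \mathrm{crypt}\end{array}}{\gamma \vdash \mathtt{ssreceive}(e_1, \dots, e_j;\, x_{j+1}, \dots, x_n)\{k\} : \mathrm{stm}} \\[5ex]
\text{(inst)} & \dfrac{\gamma(k) = \tau_1 \times \dots \times \tau_n \to \mathrm{crypt}}{\gamma \vdash \mathtt{instantiate}\ k : \mathrm{stm}}
\end{array}
\]

Table 4.4: Plain type rules for cryptographic and communicative statements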
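\[
\begin{array}{ll}
\text{(inum)} & \gamma \vdash x\{L\} := n : [x \to \mathrm{int}] \\[1ex]
\text{(iprinc)} & \gamma \vdash x\{L\} := \text{'A'} : [x \to \mathrm{principal}] \\[1ex]
\text{(itrue)} & \gamma \vdash x\{L\} := \mathtt{true} : [x \to \mathrm{bool}] \\[1ex]
\text{(ifalse)} & \gamma \vdash x\{L\} := \mathtt{false} : [x \to \mathrm{bool}] \\[1ex]
\text{(itable)} & \gamma \vdash x[n_1][n_2]\{L\} : [x \to \mathrm{int} \times \mathrm{int} \to \mathrm{int}] \\[2ex]
\text{(ikey)} & \dfrac{\gamma(d) = \tau_1 \times \dots \times \tau_n}{\gamma \vdash \mathtt{key}\ k\{L\}\ \mathtt{using}\ d : [k \to \tau_1 \times \dots \times \tau_n \to \mathrm{crypt}]} \\[3ex]
\text{(icomb)} & \dfrac{\gamma \vdash i_1 : \gamma' \quad \gamma \vdash i_2 : \gamma''}{\gamma \vdash i_1, i_2 : \gamma' \vee \gamma''}
\end{array}
\]

Table 4.5: Plain type rules for the initialization

\[
\begin{array}{ll}
\text{(pubk)} & \dfrac{\gamma(d) = \tau_1 \times \dots \times \tau_n}{\gamma \vdash k(d)^{+} : [k^{+} \to \tau_1 \times \dots \times \tau_n \to \mathrm{encrypt}]} \\[3ex]
\text{(prik)} & \dfrac{\gamma(d) = \tau_1 \times \dots \times \tau_n}{\gamma \vdash k(d)^{-} : [k^{-} \to \tau_1 \times \dots \times \tau_n \to \mathrm{decrypt}]} \\[3ex]
\text{(akcomb)} & \dfrac{\gamma \vdash AK_1 : \gamma' \quad \gamma \vdash AK_2 : \gamma''}{\gamma \vdash AK_1, AK_2 : \gamma' \vee \gamma''}
\end{array}
\]

Table 4.6: Plain type rules for the Asymmetric Keys

\[
\begin{array}{ll}
\text{(proc)} & \dfrac{\gamma \vdash AK : \gamma' \quad \gamma \vdash i : \gamma'' \quad \gamma \vee \gamma' \vee \gamma'' \vdash S : \mathrm{stm}}{\gamma \vdash A[AK] : (i)\{S\} : \mathrm{proc}} \\[3ex]
\text{(plist)} & \dfrac{\gamma \vdash P_1 : \mathrm{proc} \quad \gamma \vdash P_2 : \mathrm{proc}}{\gamma \vdash P_1\ P_2 : \mathrm{proc}}
\end{array}
\]

Table 4.7: Plain type rules for the Processes

\[
\begin{array}{ll}
\text{(kd)} & \vdash \mathtt{declare}\ d\ \mathtt{as}\ \{\tau_1\{L\}, \dots, \tau_n\{L\}\}\{L\} : [d \to \tau_1 \times \dots \times \tau_n] \\[2ex]
\text{(kdcomb)} & \dfrac{\vdash KD_1 : \gamma' \quad \vdash KD_2 : \gamma''}{\vdash KD_1; KD_2 : \gamma' \vee \gamma''}
\end{array}
\]

Table 4.8: Plain type rules for the Key Declarations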
Table 4.8 shows the rules for the key declarations. A key declaration, (kd), declares a key format which is used by either an asymmetric or a symmetric key as shown above; the name of each key declaration is mapped to the composite type shown in the table. For several key declarations, (kdcomb), the maps for each are combined.

\[
\text{(sys)} \quad \dfrac{\vdash KD : \gamma \quad \gamma \vdash P : \mathrm{proc}}{\vdash [KD]P : \mathrm{sys}}
\]

Table 4.9: Plain type rules for the System

The type rule for the system, (sys) in Table 4.9, simply types the key declarations, KD, and passes the resulting map to the processes, P.

4.2 Type System for Security Annotations

The security annotations of the gWhile language are based on the Decentralized Label Model discussed in Chapter 2. The use of a type system to statically evaluate information flow in a program is not new. In [VSI96] Volpano, Smith, and Irvine show a type system for checking a simple programming language with respect to information flow. Myers also checks the Decentralized Label Model using a type system [ML97], but uses the block label to prevent implicit flow, as described in Section 4.2.1. In [VS97] this functionality was achieved using subtypes, which is considerably more unwieldy.

The type system described in this section is used to check the Decentralized Label Model as implemented in the gWhile language. It is structured in much the same fashion as the plain type system.

Just as for the plain type system, the annotation type system has a type environment. However, the type environment for the annotation type system, λ, is the function

λ : Var → Label

The definitions of dom(λ) and λ(x) = L are the same as those of dom(γ) and γ(x) = τ in the plain type system, with the difference that λ(x) returns a label while γ(x) returns a basic type.

In addition to the type environment, the annotation type system carries two variables. The first is the set, ρ, which is used in declassification as shown in Table 4.10.
The set ρ contains the current principal as well as any principals the current principal can act for at present, and is also referred to as the set of current principals. The second is the block label, B, which is described further in Section 4.2.1 below.

The combination, or ∨, operator uses the same definition as described in Section 4.1, including the domain intersection condition.
The rules for statements, processes, and the system use the large types defined for the plain type system to indicate that they type, in the same fashion as those rules in the plain type system.

4.2.1 The Block Label

Given a simple program segment with the variables h, which is a high-security variable, and l, which is low-security:

    if h = 0 then l := 0 else l := 1

Depending on the value of l, something is known about h after the if statement has been executed. This is referred to as implicit information flow. To solve this problem, an assignment inside an if statement can only take place if the variable being assigned to is at least as restrictive as both the value being assigned and the expression being branched on. For the statement l := 0 this would be:

\[ L_0 \sqsubseteq L_l \;\wedge\; L_{h=0} \sqsubseteq L_l \]

which amounts to

\[ L_0 \sqcup L_{h=0} \sqsubseteq L_l \]

where L_{h=0} is the label for the branching expression. If an assignment is nested inside several if statements, while loops, or similar, then L_{h=0} would of course have to be augmented with the labels for those expressions as well.

The idea with the block label, which is the role L_{h=0} played in the example above, is to initialize it with the ⊥ element. Each time a branch or loop statement is encountered, the block label is augmented with the label for the expression, using the join operator, ⊔. The augmented block label is then used to check the blocks of the branch or loop. Once the blocks have been checked, the original block label is restored.

The symmetric receive and act for statement augments both the block label and the set of current principals, ρ, before checking the statement S. Both these variables are restored when the statement of the act for statement has been checked.

The type rules for the annotation type system have names in the same fashion as the plain type rules. They are, however, subscripted with L to signify that the rules deal with labels, for example (intL).
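The check performed with the block label can be made concrete with a small sketch. The Python below is an illustration only: it assumes the usual owner and reader formulation of labels in the Decentralized Label Model, represents a label as a dictionary from owner to allowed readers, and ignores the act for hierarchy between principals. All names (Label, join, restricts, assignment_ok) are hypothetical.

```python
from typing import Dict, FrozenSet

# A label maps each owner to the set of readers that owner allows.
Label = Dict[str, FrozenSet[str]]
BOTTOM: Label = {}  # the empty label, written ⊥ in the text


def join(a: Label, b: Label) -> Label:
    """Join (⊔): keep all owners; shared owners keep only common readers."""
    out = dict(a)
    for owner, readers in b.items():
        out[owner] = out[owner] & readers if owner in out else readers
    return out


def restricts(a: Label, b: Label) -> bool:
    """True if a ⊑ b: b keeps every owner of a and adds no reader."""
    return all(owner in b and b[owner] <= a[owner] for owner in a)


def assignment_ok(block: Label, l_expr: Label, l_var: Label) -> bool:
    """Check B ⊔ Le ⊑ Lx for an assignment x := e under block label B."""
    return restricts(join(block, l_expr), l_var)
```

With a label for h owned by some principal A and an empty label for l, the assignment l := 0 is rejected when the block label has been joined with the label of h = 0, but accepted outside the if statement where the block label is still ⊥.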
4.2.2 Expressions

All expression rules carry the label map, λ, and the set of current principals, ρ. The label map is used for finding labels for simple variables or tables. The current principals are used in the declassify expression as described in Section 2.2.

\[
\begin{array}{ll}
\text{(int}_L\text{)} & \rho; \lambda \vdash n : \bot \\[1ex]
\text{(var}_L\text{)} & \rho; \lambda \vdash x : L \quad \text{if } \lambda(x) = L \\[1ex]
\text{(this}_L\text{)} & \rho; \lambda \vdash \mathtt{this} : \bot \\[1ex]
\text{(princ}_L\text{)} & \rho; \lambda \vdash \text{'A'} : \bot \\[1ex]
\text{(true}_L\text{)} & \rho; \lambda \vdash \mathtt{true} : \bot \\[1ex]
\text{(false}_L\text{)} & \rho; \lambda \vdash \mathtt{false} : \bot \\[2ex]
\text{(table}_L\text{)} & \dfrac{\rho; \lambda \vdash e_1 : L_1 \quad \rho; \lambda \vdash e_2 : L_2 \quad \lambda(x) = L_x}{\rho; \lambda \vdash x[e_1][e_2] : L_x \sqcup L_1 \sqcup L_2} \\[3ex]
\text{(bop}_L\text{)} & \dfrac{\rho; \lambda \vdash e_1 : L_1 \quad \rho; \lambda \vdash e_2 : L_2}{\rho; \lambda \vdash e_1\ \mathit{bop}\ e_2 : L_1 \sqcup L_2} \\[3ex]
\text{(mop}_L\text{)} & \dfrac{\rho; \lambda \vdash e : L}{\rho; \lambda \vdash \mathit{mop}\ e : L} \\[3ex]
\text{(decl}_L\text{)} & \dfrac{\rho; \lambda \vdash e : L_e \quad L_A = \{A : \emptyset \mid A \in \rho\} \quad L_e \sqsubseteq L \sqcup L_A}{\rho; \lambda \vdash \mathtt{declassify}(e, L) : L}
\end{array}
\]

Table 4.10: Annotation type rules for expressions

Apart from the type rule (declL), the type rules for expressions are fairly straightforward. The constant rules, (intL), (thisL), (princL), (trueL), and (falseL), all return the empty label, shown as the ⊥ element. For variables, (varL), the corresponding label is fetched from λ.

The rule for tables, (tableL), is somewhat similar to the rule for variables, in that the label for the table is fetched from λ here, too. The label for the value returned from the table, however, depends not only on the label for the table, but also on the labels for the two indexing expressions. The label returned from (tableL) is therefore the join of the three labels.

Binary operators typed by (bopL), the +, =, and < operators in the gWhile language, return the join of the labels of the two expressions. Monadic operators, (mopL), for example the not operator, simply return the label of the expression. While the random function is not as such an operator, its label is handled by the type rule for monadic operators.

The (declL) rule for the declassification expression is, in contrast to the other type rules, a bit convoluted. In the Decentralized Label Model, assignment of values to variables can only take place if the assignment constitutes a restriction. This means that eventually information can no longer flow.
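Following the restriction operator, it is not possible to let another principal read data unless he is already in the effective reader set of the label. As described in Section 2.2, the declassify function allows the removal of an owner from a label, or the addition of a reader to his reader set, provided the owner is in the set of current principals, ρ.

4.2.3 Statements

In addition to λ and ρ, the annotation type rules for statements carry the block label, B.

\[
\begin{array}{ll}
\text{(ass}_L\text{)} & \dfrac{\rho; \lambda \vdash e : L_e \quad \rho; \lambda \vdash x : L_x \quad B \sqcup L_e \sqsubseteq L_x}{B; \rho; \lambda \vdash x := e : \mathrm{stm}} \\[3ex]
\text{(tass}_L\text{)} & \dfrac{\begin{array}{c}\rho; \lambda \vdash e : L_e \quad \rho; \lambda \vdash e_1 : L_1 \quad \rho; \lambda \vdash e_2 : L_2 \quad \rho; \lambda \vdash x : L_x \\ B \sqcup L_e \sqcup L_1 \sqcup L_2 \sqsubseteq L_x\end{array}}{B; \rho; \lambda \vdash x[e_1][e_2] := e : \mathrm{stm}} \\[5ex]
\text{(skip}_L\text{)} & B; \rho; \lambda \vdash \mathtt{skip} : \mathrm{stm} \\[2ex]
\text{(seq}_L\text{)} & \dfrac{B; \rho; \lambda \vdash S_1 : \mathrm{stm} \quad B; \rho; \lambda \vdash S_2 : \mathrm{stm}}{B; \rho; \lambda \vdash S_1; S_2 : \mathrm{stm}} \\[3ex]
\text{(if}_L\text{)} & \dfrac{\rho; \lambda \vdash e : L_e \quad B \sqcup L_e; \rho; \lambda \vdash S_1 : \mathrm{stm} \quad B \sqcup L_e; \rho; \lambda \vdash S_2 : \mathrm{stm}}{B; \rho; \lambda \vdash \mathtt{if}\ e\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2\ \mathtt{endif} : \mathrm{stm}} \\[3ex]
\text{(while}_L\text{)} & \dfrac{\rho; \lambda \vdash e : L_e \quad B \sqcup L_e; \rho; \lambda \vdash S : \mathrm{stm}}{B; \rho; \lambda \vdash \mathtt{while}\ e\ \mathtt{do}\ S\ \mathtt{endwhile} : \mathrm{stm}}
\end{array}
\]

Table 4.11: Annotation type rules for basic statements

Some of the statements, the aforementioned "basic" statements shown in Table 4.11, simply follow the Decentralized Label Model. The assignment rule, (assL), allows an assignment if it constitutes a restriction on the join of the label for the expression and the block label, as described in Section 4.2.1 above.

The (skipL) rule always types, while the (seqL) rule types each of the statements and then returns the large type stm.

The rules for branch statements, (ifL) and (whileL), augment the block label, B, by joining it with the label of the expression to prevent implicit information flow. The augmented block label is then used to check the block of the if or while statement.

Worth noticing is the rule for the table assignment, (tassL), compared to the expression type rule for the table, (tableL) in Table 4.10. Imagine a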
table, t, which everyone can read and whose contents are known, a variable l with a low security level, and a variable h with a high level. Given the assignment

    l := t[1][h]

it is clear that something can be learned about h from the value of l, an example of implicit flow as discussed before. The label for the expression t[1][h] is therefore dependent on the labels for t, 1, and h, as shown in the (tableL) rule of Table 4.10. Unfortunately, the inverse problem still exists, illustrated by the code

    t[1][h] := l

If the rule described above is simply followed, then t[1][h] is more restrictive than l and the assignment is valid. A search in the table t for the value of l afterwards, however, yields information about the value of h. This problem is solved by letting the labels for the indexing expressions add to the label of the assigned expression, l. In the current example this means that the label for the table must be more restrictive than the label for h joined with the label for l. This rule is shown in (tassL) in Table 4.11, with the additional use of the block label to prevent implicit flow.

As described in Section 3.2, the gWhile language contains both asymmetric and symmetric cryptographic primitives.

\[
\begin{array}{ll}
\text{(asend}_L\text{)} & \dfrac{\begin{array}{c}\rho; \lambda \vdash e_1 : L_1 \;\dots\; \rho; \lambda \vdash e_n : L_n \\ \lambda(k^+) = L_{k_1} \times \dots \times L_{k_n} \to L_k \quad (\forall i \in [1, n])(B \sqcup L_i \sqsubseteq L_{k_i})\end{array}}{B; \rho; \lambda \vdash \mathtt{asend}(e_1, \dots, e_n)\{k^+\} : \mathrm{stm}} \\[5ex]
\text{(arec}_L\text{)} & \dfrac{\begin{array}{c}\rho; \lambda \vdash e_1 : L_1 \;\dots\; \rho; \lambda \vdash e_j : L_j \quad \rho; \lambda \vdash x_{j+1} : L_{j+1} \;\dots\; \rho; \lambda \vdash x_n : L_n \\ \lambda(k^-) = L_{k_1} \times \dots \times L_{k_n} \to L_k \\ B' = B \sqcup L_1 \sqcup L_{k_1} \sqcup \dots \sqcup L_j \sqcup L_{k_j} \quad (\forall i \in [j+1, n])(B' \sqcup L_{k_i} \sqsubseteq L_i)\end{array}}{B; \rho; \lambda \vdash \mathtt{areceive}(e_1, \dots, e_j;\, x_{j+1}, \dots, x_n)\{k^-\} : \mathrm{stm}}
\end{array}
\]

Table 4.12: Annotation type rules for asymmetric cryptographic and communication statements

\[
\begin{array}{ll}
\text{(ssend}_L\text{)} & \dfrac{\begin{array}{c}\rho; \lambda \vdash e_1 : L_1 \;\dots\; \rho; \lambda \vdash e_n : L_n \\ \lambda(k) = L_{k_1} \times \dots \times L_{k_n} \to L_k \quad (\forall i \in [1, n])(B \sqcup L_i \sqsubseteq L_{k_i})\end{array}}{B; \rho; \lambda \vdash \mathtt{ssend}(e_1, \dots, e_n)\{k\} : \mathrm{stm}} \\[5ex]
\text{(srec}_L\text{)} & \dfrac{\begin{array}{c}\rho; \lambda \vdash e_1 : L_1 \;\dots\; \rho; \lambda \vdash e_j : L_j \quad \rho; \lambda \vdash x_{j+1} : L_{j+1} \;\dots\; \rho; \lambda \vdash x_n : L_n \\ \lambda(k) = L_{k_1} \times \dots \times L_{k_n} \to L_k \\ B' = B \sqcup L_1 \sqcup L_{k_1} \sqcup \dots \sqcup L_j \sqcup L_{k_j} \quad (\forall i \in [j+1, n])(B' \sqcup L_{k_i} \sqsubseteq L_i) \\ A \in \mathit{owners}(L_k) \quad B'; \rho \cup \{A\}; \lambda \vdash S : \mathrm{stm}\end{array}}{B; \rho; \lambda \vdash \mathtt{sreceive}(e_1, \dots, e_j;\, x_{j+1}, \dots, x_n)\{k\}\ \mathtt{andactfor}\ A\ \mathtt{in}\ S\ \mathtt{endactfor} : \mathrm{stm}} \\[5ex]
\text{(ssrec}_L\text{)} & \dfrac{\begin{array}{c}\rho; \lambda \vdash e_1 : L_1 \;\dots\; \rho; \lambda \vdash e_j : L_j \quad \rho; \lambda \vdash x_{j+1} : L_{j+1} \;\dots\; \rho; \lambda \vdash x_n : L_n \\ \lambda(k) = L_{k_1} \times \dots \times L_{k_n} \to L_k \\ B' = B \sqcup L_1 \sqcup L_{k_1} \sqcup \dots \sqcup L_j \sqcup L_{k_j} \quad (\forall i \in [j+1, n])(B' \sqcup L_{k_i} \sqsubseteq L_i)\end{array}}{B; \rho; \lambda \vdash \mathtt{ssreceive}(e_1, \dots, e_j;\, x_{j+1}, \dots, x_n)\{k\} : \mathrm{stm}} \\[5ex]
\text{(inst}_L\text{)} & B; \rho; \lambda \vdash \mathtt{instantiate}\ k : \mathrm{stm}
\end{array}
\]

Table 4.13: Annotation type rules for symmetric cryptographic and communication statements

The type rules for asymmetric and symmetric cryptographic communication statements, as shown in Table 4.12 and Table 4.13 respectively, are quite similar. The type rule for asend, (asendL), is analogous to the one for ssend, (ssendL), and likewise for areceive and ssreceive, with the rules (arecL) and (ssrecL). Most of the type rule for the receive and act for statement, (srecL), is the same as (ssrecL), the difference being the augmentation of ρ.

When a number of values are sent, it is useful to think of a series of assignments taking place, from the values to the fields of the send statement. The typical rule for an assignment is B ⊔ Le ⊑ Lx, where Le is the label for the expression and Lx is the label for the assignee. For each field in the key format, i ∈ [1, n], a similar rule, B ⊔ Li ⊑ Lki, is given. In the rule each field has an associated label, Lki, and each value has a label, Li.

For the receive statements pattern matching is used, which makes the type rules a bit different. The first j expressions are used for matching the pattern, while the variables specified for the remaining n − j fields are assigned to. The assignment to these variables uses a rule much the same as for send statements and normal assignments, B′ ⊔ Lki ⊑ Li. Worth noticing, however, is that the assignment is to the variables from the fields, hence the reversal of Lki and Li.

Furthermore, the block label, B′, is an augmentation of the normal block label, B. The receive statement is only executed if the first j fields match; these fields are in a way conditions on the statement. The block label is therefore enlarged in the same way it would have been for a series of nested if statements, each containing an equality condition corresponding to the fields and values.

The augmented block label, B′, is also used in the verification of the statement, S, in the receive and act for statement of (srecL). Additionally, the receive and act for statement adds the principal A to ρ before checking S. Both the block label and ρ are restored before the statements following the receive statement are checked.
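The label conditions shared by the receive rules can be sketched as a single check. As before, this Python fragment is an illustration under assumptions: labels are dictionaries from owners to allowed readers, the act for hierarchy is ignored, and the names check_receive, match_labels, var_labels, and field_labels are hypothetical.

```python
from functools import reduce
from typing import Dict, FrozenSet, List

Label = Dict[str, FrozenSet[str]]


def join(a: Label, b: Label) -> Label:
    """Join (⊔): keep all owners; shared owners keep only common readers."""
    out = dict(a)
    for owner, readers in b.items():
        out[owner] = out[owner] & readers if owner in out else readers
    return out


def restricts(a: Label, b: Label) -> bool:
    """True if a ⊑ b: b keeps every owner of a and adds no reader."""
    return all(owner in b and b[owner] <= a[owner] for owner in a)


def check_receive(block: Label, match_labels: List[Label],
                  var_labels: List[Label], field_labels: List[Label]) -> Label:
    """Check a receive with j matched fields and return the label B'."""
    j = len(match_labels)
    if j + len(var_labels) != len(field_labels):
        raise ValueError("pattern does not fit the key format")
    # B' = B ⊔ L1 ⊔ Lk1 ⊔ ... ⊔ Lj ⊔ Lkj (the order of joins is irrelevant)
    b_prime = reduce(join, match_labels + field_labels[:j], block)
    # For each assigned variable: B' ⊔ Lki ⊑ Li (flow from field to variable)
    for l_field, l_var in zip(field_labels[j:], var_labels):
        if not restricts(join(b_prime, l_field), l_var):
            raise TypeError("received field may not flow into the variable")
    return b_prime
```

The returned label corresponds to B′; for the receive and act for statement it would be the block label used when checking S, together with ρ ∪ {A}.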
4.2.4 Initialization, Keys, Processes, Key Declarations, and System

Table 4.14 shows the type rules for the initialization part of a process. They are very similar to the type rules for the plain type system shown in Section 4.1, except that variable names map to labels instead of types. Worth noticing is that the map for a symmetric key, as shown in (ikeyL), is the same as the map for its key declaration. The same is the case for the asymmetric keys shown in Table 4.15.

\[
\begin{array}{ll}
\text{(inum}_L\text{)} & \lambda \vdash x\{L\} := n : [x \to L] \\[1ex]
\text{(iprinc}_L\text{)} & \lambda \vdash x\{L\} := \text{'A'} : [x \to L] \\[1ex]
\text{(itrue}_L\text{)} & \lambda \vdash x\{L\} := \mathtt{true} : [x \to L] \\[1ex]
\text{(ifalse}_L\text{)} & \lambda \vdash x\{L\} := \mathtt{false} : [x \to L] \\[1ex]
\text{(itable}_L\text{)} & \lambda \vdash x[n_1][n_2]\{L\} : [x \to L] \\[2ex]
\text{(ikey}_L\text{)} & \dfrac{\lambda(d) = L_1 \times \dots \times L_n \to L}{\lambda \vdash \mathtt{key}\ k\ \mathtt{using}\ d : [k \to L_1 \times \dots \times L_n \to L]} \\[3ex]
\text{(icomb}_L\text{)} & \dfrac{\lambda \vdash i_1 : \lambda' \quad \lambda \vdash i_2 : \lambda''}{\lambda \vdash i_1, i_2 : \lambda' \vee \lambda''}
\end{array}
\]

Table 4.14: Annotation type rules for the initialization

\[
\begin{array}{ll}
\text{(pubk}_L\text{)} & \dfrac{\lambda(d) = L_1 \times \dots \times L_n \to L}{\lambda \vdash k(d)^{+} : [k^{+} \to L_1 \times \dots \times L_n \to L]} \\[3ex]
\text{(prik}_L\text{)} & \dfrac{\lambda(d) = L_1 \times \dots \times L_n \to L}{\lambda \vdash k(d)^{-} : [k^{-} \to L_1 \times \dots \times L_n \to L]} \\[3ex]
\text{(akcomb}_L\text{)} & \dfrac{\lambda \vdash AK_1 : \lambda' \quad \lambda \vdash AK_2 : \lambda''}{\lambda \vdash AK_1, AK_2 : \lambda' \vee \lambda''}
\end{array}
\]

Table 4.15: Annotation type rules for the Asymmetric Keys

A process, typed by (procL), receives a type environment from the key declarations through the system, types the asymmetric keys and the initialization with respect to λ, and uses the combination of the three environments, together with the ⊥ element for the block label and ρ as a singleton set of the current principal, for checking the statement. The list of processes, (plistL), is simply checked one process at a time with the environment λ as generated from the key declarations.

\[
\begin{array}{ll}
\text{(proc}_L\text{)} & \dfrac{\lambda \vdash i : \lambda' \quad \lambda \vdash AK : \lambda'' \quad \bot; \{A\}; \lambda \vee \lambda' \vee \lambda'' \vdash S : \mathrm{stm}}{\lambda \vdash A[AK] : (i)\{S\} : \mathrm{proc}} \\[3ex]
\text{(plist}_L\text{)} & \dfrac{\lambda \vdash P_1 : \mathrm{proc} \quad \lambda \vdash P_2 : \mathrm{proc}}{\lambda \vdash P_1\ P_2 : \mathrm{proc}}
\end{array}
\]

Table 4.16: Annotation type rule for the processes

The key declarations, Table 4.17, define the formats that can be used for the keys of the asymmetric and symmetric cryptography and communication statements. Each declaration is entered into the variable map, λ, with a composite type similar to the table type in the plain type system. A key declaration has, in the annotation type system, the type L1 × ... × Ln → L. The labels for each of the n fields together yield the label of the encrypted package. Table 4.12 and Table 4.13 show how this is used in the cryptographic statements.

\[
\begin{array}{ll}
\text{(kd}_L\text{)} & \vdash \mathtt{declare}\ d\ \mathtt{as}\ \{T_1\{L_1\}, \dots, T_n\{L_n\}\}\{L\} : [d \to L_1 \times \dots \times L_n \to L] \\[2ex]
\text{(kdcomb}_L\text{)} & \dfrac{\vdash KD_1 : \lambda' \quad \vdash KD_2 : \lambda''}{\vdash KD_1; KD_2 : \lambda' \vee \lambda''}
\end{array}
\]

Table 4.17: Annotation type rule for the key declarations

\[
\text{(sys}_L\text{)} \quad \dfrac{\vdash KD : \lambda \quad \lambda \vdash P : \mathrm{proc}}{\vdash [KD]P : \mathrm{sys}}
\]

Table 4.18: Annotation type rule for the system

The system, typed by (sysL) in Table 4.18, simply types the key declarations to get λ, which is used in checking the list of processes.

4.3 Type Matching Communications Analysis

The Type Matching Communications Analysis, TMCA, is a simple analysis which notes occurrences of communication statements in a program, and attempts to find out if they are matched. A communication statement is matched if there is another communication statement such that the communication can be carried out.

The analysis begins by traversing the program, recording some information concerning each communication statement. The following pieces of information are gathered:

• The type of operation, send or receive
• Asymmetric or symmetric cryptography
• The key declaration for the key used in the communication
• Iteration: whether the statement is inside a branch, inside a loop, or in the normal flow
• The principal for the current process
After this information has been collected for each communication statement, the statements are matched against each other to verify that everything that is communicated has a counterpart. By its nature, however, the analysis is an over-approximation. This means that there may be communication statements that are matched by the rules but would have no match if the program were executed. However, if a communication statement is not matched by the analysis, it really cannot be matched.

The assertion that the analysis is an over-approximation is connected to the matching rules below. In itself, the gathering of information says nothing about the nature of the analysis, but the rules for iteration are constructed so that the analysis matches a statement if at all possible.

In contrast, an under-approximation would match only those statements that it could be sure, with absolute certainty, would actually communicate. In that case some statements that do communicate during execution might be left unmatched, but if the analysis said that two statements match, they would match in the execution of the program. The data gathered is insufficient to allow the matching rules to make this kind of distinction.

4.3.1 Matching Rules

The tuples of information recorded for each communication statement must be matched according to a fixed set of criteria:

  1. A send statement must be matched to a receive statement, and vice versa.
  2. Communication statements can only match statements of the same type: asymmetric matches asymmetric, symmetric matches symmetric.
  3. The key declarations must be the same.
  4. Since communication is synchronous, a statement cannot be matched to other statements from the same process.

These rules leave the iteration information. When this information is gathered, loops have higher precedence than branches, which in turn have higher precedence than the normal flow.
In other words, it does not matter whether a while loop is inside an if statement or an if statement is inside a while loop: a communication statement on the innermost level may be performed zero or more times.

When the iteration information is matched, communication statements in the normal flow take the highest precedence. Since it is not known whether a statement inside a branch or loop will be executed at all, it is more important to see whether the statements in the normal flow can be matched. Given a program with

    ssend(x, b){k}

for process A and

    while b do
        ssreceive(x; b){k}
    endwhile;
    ssreceive(x; b){k}

for process B, the loop in process B may never be executed, and the analysis should approve this program.

The above leads to a matching algorithm in which a statement in the normal flow that sits opposite a branch or loop first tries to match other communication statements in the normal flow. Only if the statement does not match any statement in the normal flow will it be matched with the branch or loop opposite it.

If a branch matches another statement (after checking as described above), it is removed in the same fashion as a statement in the normal flow. A loop, however, is pushed back onto the list of statements to match, so that it is matched against all the other statements, because a loop may be executed an arbitrary number of times.

If a statement cannot be matched, an appropriate error message is output.
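The four criteria and the preference for statements in the normal flow can be sketched in Standard ML as follows. This is a hedged illustration under stated assumptions and not the matching algorithm of the thesis: the record type `comm`, the functions `compatible`, `rank`, and `partner`, and the key declaration name `d` are hypothetical, and ranking branches ahead of loops is an assumption, since the text only says that the normal flow is tried first.

```sml
(* Hypothetical record of the information gathered per statement. *)
datatype direction = SEND | RECEIVE
datatype crypto    = ASYMMETRIC | SYMMETRIC
datatype iteration = NORMAL | BRANCH | LOOP

type comm = { dir : direction, crypto : crypto, keydecl : string,
              iter : iteration, princ : string }

(* Criteria 1-4: opposite direction, same kind of cryptography,
   same key declaration, and a different process. *)
fun compatible (a : comm, b : comm) =
    #dir a <> #dir b andalso
    #crypto a = #crypto b andalso
    #keydecl a = #keydecl b andalso
    #princ a <> #princ b

(* Lower rank is tried first. Branch before loop is an assumption. *)
fun rank NORMAL = 0
  | rank BRANCH = 1
  | rank LOOP   = 2

(* Pick the compatible candidate with the lowest rank, if there is one. *)
fun partner (s : comm, others : comm list) : comm option =
    let
      fun better (c : comm, NONE) = SOME c
        | better (c, SOME (best : comm)) =
            if rank (#iter c) < rank (#iter best) then SOME c else SOME best
    in
      List.foldl better NONE (List.filter (fn c => compatible (s, c)) others)
    end

(* The example program from the text: A sends once; B receives inside a
   loop and once more in the normal flow. *)
val send : comm =
    { dir = SEND, crypto = SYMMETRIC, keydecl = "d", iter = NORMAL, princ = "A" }
val loopRecv : comm =
    { dir = RECEIVE, crypto = SYMMETRIC, keydecl = "d", iter = LOOP, princ = "B" }
val flowRecv : comm =
    { dir = RECEIVE, crypto = SYMMETRIC, keydecl = "d", iter = NORMAL, princ = "B" }

val chosen = partner (send, [loopRecv, flowRecv])
```

In this sketch `chosen` selects the receive in the normal flow rather than the one in the loop. A complete matcher would additionally remove matched normal-flow and branch statements from the work list while keeping loops available for further matches, as described above.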
CHAPTER 5
Implementation

To test the design of the language, type systems, and simple analysis, an implementation of each was carried out.

The implementation is based on the Moscow ML implementation of Standard ML [Sesff]. The strong pattern matching, higher-order functions, and type features of SML made it an obvious choice. The Moscow ML implementation was chosen for its availability on a large number of platforms and its use in previous work.

In this chapter some knowledge of SML is assumed. For further information see Introduction to Programming using SML [HR99] or similar. In the implementation of the typechecker the modules Set and Table are used [HR99, Appendix E].

Section 5.1 discusses the parser, Section 5.2 the parse tree for the gWhile language, and Section 5.3 the implementation of the type systems. Section 5.4 describes the Type Matching Communications Analysis. The testing procedure for the implementation is described in Section 5.5. Finally, the implementation of Battleships is discussed in Section 5.6.

5.1 Parsing the gWhile Language

The parser for the gWhile language is implemented in variants of Lex and Yacc for SML. For all intents and purposes the Lex and Yacc versions for SML are the same as those for the C programming language. The parser functions are inspired by earlier implementations [TH03].

In the implementation of the parser there have been a few changes to prevent shift/reduce and reduce/reduce conflicts. These changes have carried over to the abstract syntax shown in Section 3.1, and it is possible to directly implement the abstract syntax using Lex and Yacc. There have been other changes, however, that have been made to ease the traversal of the parse tree.

The abstract syntax for the key declarations, processes, asymmetric keys, initializations, and labels implies that they have a tree structure when there are several of them. For example, two processes are shown in the abstract syntax as a process tree which has two branches, each containing a process. In the implementation these trees are represented as lists, as shown in Section 5.2.

At times it is useful to allow the else clause of an if statement to be omitted; this has been implemented in the parser. The returned syntax tree for such an if statement is equivalent to that of an if statement where the else branch contains a skip statement and nothing else.

Lastly, there is the matter of type specification in the key declarations. It is quite difficult to write $\mathit{int} \times \mathit{int} \to \mathit{int}$ using only ASCII characters, and similarly for the $\tau_1 \times \dots \times \tau_n \to \mathit{crypt}$ type for symmetric keys. In the implemented syntax the two-dimensional array, or table, is represented with the keyword table. The symmetric keys, however, are a bit different. Since a symmetric key is based upon a key declaration, and the key declarations are global and not used elsewhere in the declarations themselves, a symmetric key is denoted, in a key declaration, by the name of the key declaration it corresponds to. For example, the following key declaration can be given:

    declare d1 as {int{A:}, bool{B: A, C}}{A: all}

A key declaration for communicating this key as a session key would then have the form

    declare d2 as {table{B: A}, d1{B: C}}{B: all}

This syntax means that the key declaration d1 must be declared before d2, since it is referenced there. The remaining available types in a key declaration are int, bool, and principal, for numbers, boolean values, and principals respectively.

It is also difficult to write subscripts as used in the asymmetric keys. A public key in use is therefore simply written as k+, and declared as k(d2)+ if it should use the key declaration d2.

5.2 The gWhile Parse Tree

As mentioned above, the implementation of the gWhile language differs in some respects from the abstract syntax shown in Section 3.1. This section describes the datatypes for the implemented parse tree of the gWhile language.
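As one illustration of the differences mentioned above, the optional else clause from Section 5.1 can be sketched in Standard ML. This is a minimal sketch and not the thesis parser: the helper `mkIf` is hypothetical, and the datatypes are reduced stand-ins that contain only the constructors the example needs.

```sml
(* Reduced, hypothetical subset of the parse tree, for illustration only. *)
datatype expr = TRUE | FALSE | VAR of string
datatype stmt = SKIP
              | ASS of expr * expr
              | IF of expr * stmt * stmt

(* Hypothetical helper that a grammar action could call: a missing else
   branch becomes a single skip statement. *)
fun mkIf (cond : expr, thenS : stmt, elseS : stmt option) : stmt =
    case elseS of
        SOME s => IF (cond, thenS, s)
      | NONE   => IF (cond, thenS, SKIP)

(* "if b then x := true" with no else clause *)
val noElse = mkIf (VAR "b", ASS (VAR "x", TRUE), NONE)
```

Normalising in the parser means that later passes only ever see if statements with both branches present.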
    type label = (string * string list) list;
    (* owner: reader1, reader2, ...; owner2: ... *)

Figure 5.1: Datatype for labels

Figure 5.1 shows the datatype for labels. A label {A : B; B : A, C} is written in the parse tree as

    [("A", ["B"]), ("B", ["A", "C"])]

Labels are discussed further in Section 5.3.2.

    datatype expr =
        NUM of int
      | VAR of string
      | THIS
      | PRINC of string
      | TABLE of string * expr * expr
      | TRUE
      | FALSE
      | RAND of expr
      | NOT of expr
      | ADD of expr * expr
      | EQ of expr * expr
      | LT of expr * expr
      | DECL of expr * label ;

Figure 5.2: Datatype for expressions

The datatype for expressions, shown in Figure 5.2, closely follows the abstract syntax. Variables and tables carry their name, which is used in the type system to look up their values and label. This, true, and false denote their values directly, while the remaining expressions either have a value or have associated expressions which can be followed in a tree structure.

The statements in Figure 5.3 also follow the abstract syntax. Note the use of lists for the expressions and variables in the communication statements. The variables of the receive statements are denoted as strings to ensure that they are simply variable names. An assignment assigns to the first expression from the second; it is the job of the type checker to ensure that the first expression is a variable or table index. The SRECETC type is the symmetric receive and act for statement; the final string of that statement is the principal which the current process wants to act for.

    datatype stmt =
        ASS of expr * expr
      | SKIP
      | SEQ of stmt * stmt
      | IF of expr * stmt * stmt
      | WH of expr * stmt
      | ASEND of expr list * string
      | AREC of expr list * string list * string
      | SSEND of expr list * string
      | SRECETC of expr list * string list * string * string * stmt
      | SSREC of expr list * string list * string
      | INST of string ;

Figure 5.3: Datatype for statements

    datatype init =
        INUM of string * int * label
      | IPRI of string * string * label
      | ITRU of string * label
      | IFAL of string * label
      | ITAB of string * int * int * label
      | IKEY of string * string * label ;

Figure 5.4: Datatype for the initialization

For the initialization in Figure 5.4, the first string of every constructor is the name of the variable, and each variable also has a label. A number simply has its numeric value. A principal variable has a string containing the name of the principal. Boolean variables are initialized specifically with a truth value. The table has its dimensions as numbers, and the key has the name of its associated key declaration.

    datatype akey =
        PUBK of string * string
      | PRIK of string * string ;

Figure 5.5: Datatype for the asymmetric keys

Figure 5.5 shows the two types for the asymmetric keys. The public and the private key each have a constructor in the datatype. The first string is the identifier of the key, which is combined with a + for a public key and a − for a private key to create the name of the key. The second string is the key declaration for the key.

    datatype process = PROC of string * akey list * init list * stmt ;
    datatype keydecl = KD of string * keylabel list * label ;
    datatype sys = SYS of keydecl list * process list ;

Figure 5.6: Datatype for processes, key declarations, and system

The final three datatypes, shown in Figure 5.6, are the processes, key declarations, and the system. A process has an identifier or principal, a list of asymmetric keys, a list of initializations, and a statement tree. The use of lists differs from the tree structure of the abstract syntax, as described in Section 5.1 above. SML has a large number of functions that work on lists and make list traversal more elegant than tree traversal.

A key declaration consists of a name, a list of type names and labels, and the label of the encrypted package. The list in the key declaration is the format of the associated keys and is simply a list with fields of the format

    string * label

The system contains a list of key declarations and a list of processes.

5.3 Type System Implementation

In the implementation of the type systems, the two systems described in Section 4.1 and Section 4.2 were combined and implemented as one. While the block label, $B$, and the current principals, $\rho$, were unchanged, the variable maps, $\lambda$ and $\gamma$, were combined into one map. This new map, also called $\gamma$, has the format

\[ \gamma : \mathit{Var} \to \tau \times \mathit{Label} \]

for each variable $x$ with the type $\tau$ and the label $L$. The value of $\tau$ would reside in the old map $\gamma$, while the label would be stored in $\lambda$. The new $\gamma$ is represented by the SML type

    (string, basic_type * label) table

5.3.1 Data Types

The implementation is built around the basic_type datatype, which is shown in Figure 5.7. These types are analogous to the basic types in Section 4.1.

    datatype basic_type =
        T_INT
      | T_BOOL
      | T_PRINCIPAL
      | T_TABLE of basic_type * basic_type * basic_type
      | T_KEYDECL of (basic_type * label) list
      | T_KEY of (basic_type * label) list * (basic_type * label)
      | T_CRYPT
      | T_ENCRYPT
      | T_DECRYPT
      | T_ERROR ;

Figure 5.7: The basic_type datatype

While the types for integers, booleans, and principals are quite straightforward, the remaining types require a little explanation. The type for a table is initialized as
