Archives

Context-Aware Scanning and Determinism-Preserving Grammar Composition, in Theory and Practice

This thesis documents several new developments in the theory of parsing and demonstrates their practical value through their implementation in the Copper parser generator.
The most widely-used apparatus for syntactic analysis of programming languages consists of a scanner based on deterministic finite automata, built from a set of regular expressions called the lexical syntax, and a separate parser, operating on the output of this scanner, built from a context-free grammar and utilizing the LALR(1) algorithm.

While the LALR(1) algorithm has the advantage of guaranteeing an unambiguous parse, and keeping the scanner and parser separate makes the compilation process cleaner and more modular, the approach is also brittle. The class of grammars that can be parsed with an LALR(1) parser is not closed under grammar composition, and small changes to an LALR(1) grammar can remove it from the LALR(1) class. Furthermore, the separation of scanner and parser prevents the use, in any organized way, of parser context to resolve ambiguities in the lexical syntax.

One area in which both of these drawbacks pose a particular problem is the parsing of embedded and extensible languages. In particular, this poses one of the last major impediments to the development of an extensible compiler in which language extensions are imported and composed by the end user (programmer) in a manner analogous to the way libraries are imported today. This is due not only to the LALR(1) grammar class not being closed under composition, but also to the very real possibility that the lexical syntaxes of two different extensions will clash, making it difficult or impossible to construct a scanner without extensive manual resolution of ambiguities.

This thesis presents three innovations that are a large step towards eliminating parsing as an Achilles’ heel in this application. Firstly, it describes a new scanning algorithm called context-aware scanning, in which the scanner at each scan is made aware of what sorts of tokens the parser recognizes as valid at that point. By allowing the use of parser context in the scanner to disambiguate, context-aware scanning makes the specification of embedded languages much simpler: instead of specifying a scanner that must explicitly switch “modes” to scan the different embedded languages, one simply compiles a context-aware scanner from a single lexical specification, which has implicit “modes” to scan properly on each embedded language. Similarly, in an extensible language, this puts a degree of separation between the lexical syntax of different language extensions, so that any clashes of this sort will not be an issue.
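
The core idea can be sketched as follows. This is a simplified illustration in Python with hypothetical token names and regular expressions, not Copper's actual implementation: the parser hands the scanner the set of terminals that are valid in its current state, and the scanner matches only those, so the same lexeme can be a keyword in one context and an identifier in another.

```python
import re

# Lexical syntax: terminal name -> regular expression (illustrative only).
LEXICAL_SYNTAX = {
    "KW_TABLE": r"table\b",       # keyword in a hypothetical embedded SQL extension
    "IDENT":    r"[A-Za-z_]\w*",  # host-language identifier
    "NUMBER":   r"\d+",
}

# Lexical precedence: on a tie in match length, the earlier terminal wins,
# so keywords beat identifiers where both are valid.
PRECEDENCE = ["KW_TABLE", "NUMBER", "IDENT"]

def context_aware_scan(text, pos, valid_terminals):
    """Return (terminal, lexeme) for the longest match among only the
    terminals the parser deems valid at this point, or None."""
    best = None
    for term in PRECEDENCE:
        if term not in valid_terminals:
            continue
        m = re.compile(LEXICAL_SYNTAX[term]).match(text, pos)
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (term, m.group())
    return best

# In a host-language context, "table" scans as an ordinary identifier...
print(context_aware_scan("table", 0, {"IDENT", "NUMBER"}))
# ...but where the parser expects the embedded extension, it is a keyword.
print(context_aware_scan("table", 0, {"KW_TABLE", "IDENT"}))
```

Note that no explicit mode switching appears anywhere: the "modes" arise implicitly from whichever terminal set the parser passes in.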

Secondly, the thesis describes an analysis that operates on grammar extensions of a certain form, and can recognize their membership in a class of extensions that can be composed with each other and still produce a deterministic parser—enabling end-users to compose extensions written by different language developers with this guarantee of determinism. The analysis is made practical by context-aware scanning, which ensures a lack of lexical issues to go along with the lack of syntactic nondeterminism. It is this analysis — the first of its kind — that is the largest step toward realizing the sort of extensible compiler described above, as the writer of each extension can test it independently using the analysis and thus resolve any lexical or syntactic issues with the extension before the end user ever sees it.

Thirdly, the thesis describes a corollary of this analysis, which allows extensions that have passed the analysis to be distributed in parse table form and then composed on-the-fly by the end users, with the same guarantee of determinism. Besides expediting the operation of composition, this also enables the use of the analysis in situations where the writer of a language or language extension does not want to release its grammar to the public.

Finally, the thesis discusses how these approaches have been implemented and made practical in Copper, including runtime tests, implementations and analyses of real-world grammars and extensions.

Read More

Device for Preventing, Detecting and Responding to Security Threats

  • Steven A. Harp
  • Tom Haigh
  • Johnathan Gohde
  • Richard O'Brien

A device to prevent, detect and respond to one or more security threats between one or more controlled hosts and one or more services accessible from the controlled host. The device determines the authenticity of a user of a controlled host and activates user-specific configurations under which the device monitors and controls all communications between the user, the controlled host and the services. As such, the device ensures the flow of only legitimate and authorized communications. Suspicious communications, such as those with malicious intent or malformed packets, are stopped and reported for analysis and action. Additionally, upon detecting suspicious communication, the device modifies the activated user-specific configurations under which the device monitors and controls the communications between the user, the controlled host and the services.

Read More

XEBHRA: A Virtualized Platform for Cross Domain Information Sharing

    The Unified Cross Domain Management Office (UCDMO) states that its mission is to provide coordination and oversight for the cross domain community’s vision of “secur[ing] cross domain access to and sharing of timely and trusted information, creating a seamless Enterprise that enables decision advantage.” The UCDMO defines three types of cross domain solution (CDS) — transfer, access and multi-level — to satisfy this vision.
    The transfer CDS, or guard, moves information securely between software applications running in different information security domains. Since the guard must approve all information flows between domains, it is traditionally deployed on a standalone computer host that provides the only physical link between the domains’ networks. This deployment strategy ensures that the guard cannot be bypassed. Unfortunately, as the demand for sharing increases, this strategy can prove costly. Data centers, for example, may charge more for custom guard hardware that cannot be reallocated easily for other uses.

    Read More

    Even harmonious labelings of disjoint graphs with a small component

    A graph G with q edges is said to be harmonious if there is an injection f from the vertices of G to the group of integers modulo q such that when each edge xy is assigned the label f(x) + f(y) (mod q), the resulting edge labels are distinct. If G is a tree, exactly one label may be used on two vertices. Over the years, many variations of harmonious labelings have been introduced.
    We study a variant of harmonious labeling. A function f is said to be a properly even harmonious labeling of a graph G with q edges if f is an injection from the vertices of G to the integers from 0 to 2(q-1) and the induced function f* from the edges of G to {0, 2, …, 2(q-1)} defined by f*(xy) = f(x) + f(y) (mod 2q) is bijective. We investigate the existence of properly even harmonious labelings of families of disconnected graphs with one of C3, C4, K4 or W4 as a component.
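
The definition above is mechanical enough to check directly. As a small illustrative sketch (a hypothetical helper in Python, not from the paper), the following verifies whether a given labeling is properly even harmonious; the example uses the cycle C3, where the labeling f = (0, 2, 4) induces the edge labels {2, 0, 4}:

```python
def is_properly_even_harmonious(vertices, edges, f):
    """Check the definition: f injects the vertices into {0, ..., 2(q-1)},
    and f*(xy) = f(x) + f(y) (mod 2q) is a bijection onto {0, 2, ..., 2(q-1)}."""
    q = len(edges)
    labels = [f[v] for v in vertices]
    if len(set(labels)) != len(labels):                 # f must be injective
        return False
    if not all(0 <= lab <= 2 * (q - 1) for lab in labels):
        return False
    induced = [(f[x] + f[y]) % (2 * q) for (x, y) in edges]
    return sorted(induced) == list(range(0, 2 * q, 2))  # bijective onto the evens

# C3 (a triangle) with q = 3 edges: vertex labels must lie in 0..4, and the
# induced edge labels must be exactly {0, 2, 4} mod 6.
triangle_vertices = ["a", "b", "c"]
triangle_edges = [("a", "b"), ("b", "c"), ("c", "a")]
print(is_properly_even_harmonious(triangle_vertices, triangle_edges,
                                  {"a": 0, "b": 2, "c": 4}))   # a valid labeling
```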

    Read More

    Architecture Modeling and Analysis for Safety Engineering

    Architecture description languages such as AADL allow systems engineers to specify the structure of system architectures and perform several analyses over them, including schedulability, resource analysis, and information flow. In addition, they permit system-level requirements to be specified and analyzed early in the development process of airborne and ground-based systems. These tools can also be used to perform safety analysis based on the system architecture and initial functional decomposition. Using AADL-based system architecture modeling and analysis tools as an exemplar, we extend existing analysis methods to support system safety objectives of ARP4754A and ARP4761. This includes extensions to existing modeling languages to better describe failure conditions, interactions, and mitigations, and improvements to compositional reasoning approaches focused on the specific needs of system safety analysis. We develop example systems based on the Wheel Braking System in SAE AIR6110 to evaluate the effectiveness and practicality of our approach.

    Read More

    Properly even harmonious labelings of disconnected graphs

    A graph G with q edges is said to be harmonious if there is an injection f from the vertices of G to the group of integers modulo q such that when each edge xy is assigned the label f(x) + f(y) (mod q), the resulting edge labels are distinct. If G is a tree, exactly one label may be used on two vertices. Over the years, many variations of harmonious labelings have been introduced.
    We study a variant of harmonious labeling. A function f is said to be a properly even harmonious labeling of a graph G with q edges if f is an injection from the vertices of G to the integers from 0 to 2(q-1) and the induced function f* from the edges of G to {0, 2, …, 2(q-1)} defined by f*(xy) = f(x) + f(y) (mod 2q) is bijective. This paper focuses on the existence of properly even harmonious labelings of the disjoint union of cycles and stars, unions of cycles with paths, unions of squares of paths, and unions of paths.

    Read More

    AADL Annex for the FACE™ Technical Standard, Edition 3.0

    This annex is intended to help component vendors and system integrators using the FACE™ (Future Airborne Capability Environment) Technical Standard. The FACE Technical Standard, Edition 3.0, provides a data modeling architecture but does not provide mechanisms for describing component behavior or timing properties. This document provides guidance for translating a FACE Standard Edition 3.0 Data Architecture XMI model into AADL so that behavior and timing properties can be added and analyzed.

    This annex supports the modeling, analysis, and integration of FACE artifacts in AADL. It gives AADL style guidelines and an AADL property set to provide a common approach to using AADL to express architectures that include FACE components. Using common properties and component representations in AADL makes AADL models of FACE components portable and reusable and increases the utility of tools that operate on such AADL models.

    Read More

    Safety Annex for the Architecture Analysis and Design Language

    • Danielle Stewart
    • Jing Liu
    • Darren Cofer
    • Mats Heimdahl
    • Michael W. Whalen
    • Michael Peterson

    Model-based development tools are increasingly being used for system-level development of safety-critical systems. Architectural and behavioral models provide important information that can be leveraged to improve the system safety analysis process. Model-based design artifacts produced in early stage development activities can be used to perform system safety analysis, reducing costs and providing accurate results throughout the system life-cycle. In this paper we describe an extension to the Architecture Analysis and Design Language (AADL) that supports modeling of system behavior under failure conditions. This Safety Annex enables the independent modeling of component failures and allows safety engineers to weave various types of fault behavior into the nominal system model. The accompanying tool support uses model checking to propagate errors from their source to their effect on top-level safety properties without the need to add separate propagation specifications. Our tools are also able to compute minimal cut sets for these errors to produce fault trees familiar to safety engineers and certification authorities. We describe the Safety Annex, illustrate its use with a representative example, and discuss and demonstrate the tool support enabling an analyst to investigate the system behavior under failure conditions.

    Read More

    Architecture Centric Virtual Integration Process (ACVIP): A Key Component of the DoD Digital Engineering Strategy

    Challenging problems associated with system software complexity growth are threatening industry’s ability to build next generation safety- and security-critical embedded cyber physical weapon systems including vertical lift avionics systems. Contributors to these problems include the growth of software enabled capabilities, interaction complexity in system integration, and ambiguous, missing, incomplete, and inconsistent requirements. Problems continue to hamper systems in the areas of resource utilization, timing and scheduling, concurrency and distribution, and safety and security. A new approach called Architecture Centric Virtual Integration Process (ACVIP), based on the SAE International® Aerospace Standard AS5506C Architecture Analysis and Design Language (AADL), is being developed and investigated by the United States (US) Army to address these challenges. ACVIP is a compositional, quantitative, architecture-centric, model-based approach enabling virtual integration analysis in the early phases and throughout the lifecycle to detect and remove defects that currently are not found until software, hardware, and systems integration and acceptance testing. The Science & Technology (S&T) program called Joint Multi-Role (JMR) Technology Demonstrator (TD) with the Mission System Architecture Demonstration effort is developing, piloting, evaluating and maturing Modular Open Systems Approach (MOSA), a Comprehensive Architecture Strategy (CAS), and Model Based Engineering (MBE) including ACVIP through a number of projects with contractor teams to prepare for the Future Vertical Lift (FVL) family-of-systems. ACVIP plays a key role in addressing issues in cyber-physical systems (CPS) and can be a key contributor to the US Department of Defense (DoD) Digital Engineering Strategy. 
It provides a well-defined standard as a foundation for a commercial tool marketplace, a ready base for ongoing efforts to mature and commercialize the technology, early demonstrations of success, and a unique architectural contribution to an authoritative source of truth (ASoT). We will first discuss the challenges in CPS development and the contribution ACVIP makes to addressing these challenges. We then outline how ACVIP is a key component that contributes to all five goals of the DoD Digital Engineering Strategy.

    Read More

    Model-Based Compliance Testing of PKCS#11 Providers

    PKCS#11, also known as Cryptoki, is a widely adopted C-language API and interoperability standard for communicating with cryptographic libraries. While comprehensive and prescriptive, the standard is also extremely complex. The base specification alone describes approximately 50 functions through over 100 pages of documentation. Add-on specifications provide additional functionality, but also impose additional complexity. As the standard evolves through collaboration and expansion, new areas of imprecision and ambiguity are introduced which make it difficult for vendors to implement libraries that are 100% functionally accurate and compliant with the specification. Users of cryptographic libraries are familiar with the industry reality that PKCS#11 libraries will not always interoperate flawlessly with each other. In the best case, these divergences result in development delays as build issues and functional deficiencies are root-caused and remedied. In the worst case, customers may face production outages or data corruption due to incorrect assumptions or imperfect testing.

    To address this problem at its core, Galois has developed an API specification and testing framework that captures the complex behaviors of Cryptoki in a mathematical specification language. This specification, in turn, is used to automatically synthesize a test suite that enforces compliance with (a mathematical model of) the PKCS#11 standard. The approach, known as model-based testing, was used to generate a corpus of PKCS#11 compliance tests that provide complete test coverage over the formal model.

    While model-generated tests are designed to enforce compliance with the PKCS#11 standard, they have also proven effective at keeping bugs out of production. Error handling scenarios defined in the standard often represent corner cases which, if not properly handled by implementation code, will cause program crashes or other serious defects. Because the aim of model-based testing is to generate test cases for all the behaviors in the standard, errors of this variety are naturally uncovered by compliance tests. As a result, assuring PKCS#11 libraries with model-generated compliance tests leads to client applications that are more portable, more robust, and less prone to security vulnerabilities.
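
The flavor of model-based testing described above can be illustrated with a toy sketch (in Python; this is not Galois's framework, and the `reset` hook and the callable-library interface are hypothetical). A tiny two-state model of the Cryptoki initialization protocol serves as the oracle, and every call sequence up to a fixed depth is enumerated as a compliance test, which is how corner cases like "C_Finalize before C_Initialize" get covered automatically:

```python
from itertools import product

# Model: state -> function -> (expected return code, next state).
# The return codes CKR_OK, CKR_CRYPTOKI_ALREADY_INITIALIZED and
# CKR_CRYPTOKI_NOT_INITIALIZED are defined by the PKCS#11 standard.
MODEL = {
    "UNINITIALIZED": {
        "C_Initialize": ("CKR_OK", "INITIALIZED"),
        "C_Finalize":   ("CKR_CRYPTOKI_NOT_INITIALIZED", "UNINITIALIZED"),
    },
    "INITIALIZED": {
        "C_Initialize": ("CKR_CRYPTOKI_ALREADY_INITIALIZED", "INITIALIZED"),
        "C_Finalize":   ("CKR_OK", "UNINITIALIZED"),
    },
}

def generate_tests(depth):
    """Enumerate every call sequence of the given length, pairing each with
    the return codes the model predicts; this gives full coverage of the
    model's behaviors up to that depth."""
    tests = []
    for seq in product(MODEL["UNINITIALIZED"], repeat=depth):
        state, expected = "UNINITIALIZED", []
        for fn in seq:
            rv, state = MODEL[state][fn]
            expected.append(rv)
        tests.append((seq, expected))
    return tests

def run_tests(library, tests):
    """Run an implementation (here, any callable mapping a function name to a
    return code) against every generated test; return the failing cases."""
    failures = []
    for seq, expected in tests:
        library("reset")  # hypothetical hook putting the library in a fresh state
        actual = [library(fn) for fn in seq]
        if actual != expected:
            failures.append((seq, expected, actual))
    return failures
```

An implementation that forgets the "not yet initialized" state check, for instance, returns CKR_OK from C_Finalize in the uninitialized state and is flagged by the depth-1 tests already.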

    Read More