Tech Talk: ROSE Open Compiler Infrastructure

The April 2nd Galois Tech Talk  will be delivered by Daniel J. Quinlan, titled “ROSE Open Compiler Infrastructure for Software Analysis, Transformation, and Optimization”.

  • Date: Thursday, April 2nd, 2009
  • Time: 10:30am – 11:30am
  • Location: Galois, Inc.421 SW 6th Ave. Suite 300(3rd floor of the Commonwealth Building)Portland, OR 97204

(NB. This seminar is on Thursday, instead of the usual Tuesday slot.)Abstract:  ROSE is a tool for building source-to-source transformation tools for the custom analysis, optimization, and transformation of large scale C, UPC, C++ and Fortran (F2003,F95,F90,F77,F66) applications, and also OpenMP.  Recent work over the last two years has added binary analysis support to ROSE (specifically support for x86, ARM, and PowerPC instruction sets and both Windows (PE, NE, LE, DOS formats)and Linux (Elf format). More recent work has added dynamic analysis support using “Intel Pin” to mix both static and dynamic analysis.  ROSE has an external community of users, developers, and collaborators and has been used in a number of research and industry tools (more information is available at www.roseCompiler.org). It has been a basis for a number of external collaborations in program optimization and cyber-security and we invite new ones.Specifically, ROSE is packaged to allow construction of custom tools by a non-compiler audience. Central to ROSE is the analysis and transformation of the Abstract Syntax Tree (as well as other graphs generated from it) and its transformation to generate new code. Research work has addressed a range of topics from optimization (loop optimization, MPI optimization, data structure optimization, automated parallelization using OpenMP, etc.) to the details of macro handling in CPP preprocessing and advanced token handling for extremely precise levels of source-to-source code regeneration and analysis of subtle details typically not possible within conventional compiler infrastructures.Ongoing research has focused on the use of ROSE within Cyber-Security research and the development of supporting program analysis and support for detection of general properties of both source code and binaries. Current work includes the representation and analysis of binaries and the leveraging of source-code based program analysis for use in binary analysis. Ongoing research work within ROSE is to present a uniform program analysis tool chain for both source code and binaries.Recent work has developed Compass, a tool built on top of ROSE.  Compass is a useful tool on its own and also a vehicle for research on the specification of code patterns (secure coding rules) and the detection of rule violations in large scale applications. This talk will present ROSE, Compass, and recent research work on shared and distributed memory parallel static analysis techniques to scale the detection of rule violations for potentially large rule sets (thousands of rules) on large-scale (many-million line) applications.  The exact same infrastructure can also be used for binary analysis, but secure code rule sets for binaries are not as well developed as for source code. This talk will outline some of the work within ROSE generally and with particular emphasis on building research collaborations.Bio: Dan Quinlan is project leader for the ROSE Project at Lawrence Livermore National Laboratory. His research is in numerous areas that intersect computer science and numerical analysis. Research interests include: semantics-based source code transformations and optimizations, C++ compiler tools/infrastructure/design, cache-based optimizations, object oriented abstractions, parallel array classes, parallel data distribution mechanisms, parallel load balancing algorithms, object-oriented numerical frameworks, parallel adaptive mesh refinement, and parallel multigrid algorithms. Quinlan earned his Ph.D. in Computational Mathematics from University of Colorado.


Galois has been holding weekly technical seminars for several years on topics from functional programming, formal methods, compiler and language design, to cryptography, and operating system construction, with talks by many figures from the programming language and formal methods communities. The talks are open and free. An RSVP is not required, but feel free to contact the organizer with questions and comments.

Read More

One Million Haskell Downloads…

Galois engineers write a lot of Haskell (in fact, our technology catalogue is built pretty much entirely on it). We find we’re able to build systems faster, with fewer errors, and in turn are able to apply techniques to increase assurance, helping us deliver value to our clients. We’ve successfully engineered large systems in the language for nearly a decade. We also use and write a lot of open source Haskell code. Since 2004 we’ve been investing in improving packaging and distribution infrastructure for Haskell code, and since 2007 Galois has been hosting hackage.haskell.org: the central online database of open source Haskell libraries and applications. These packages are built via Cabal (dreamed up by Galois’ own Isaac Potoczny-Jones), and distributed via cabal-install. Hackage now hosts more than 1100 released libraries and tools, and has been growing rapidly (and, incidentally, Galois employees have released or been significant contributors to just shy of 10% of all Hackage projects).We’ve wondered for a while now just how busy Hackage was becoming, and in turn, what other interesting information about Haskell were buried in the Hackage logs. This post answers those questions for the first time. We’ll see

  • Total, and growing, Haskell source downloads
  • The most popular Haskell projects hosted on Hackage
  • The most popular development categories
  • The most popular methods for distributing Haskell source

and speculate a little on where Hackage is heading.

Background

We’ve known for a while that uploads to Hackage were growing. You might have seen this graph elsewhere (it’s derivable from the RSS logs of package uploads):

There’s a pretty clear trend upwards. Average daily Hackage releases have increased 4-fold since Hackage was launched, and it’s now averaging 10 packages a day released. The question is: was anyone using this code?

Read More

Solving Sudoku Using Cryptol

Cryptol is a language tailored for cryptographic algorithms. Sudoku is a popular puzzle the reader  is no-doubt already familiar with. We will offer no deep reason why anyone should try to solve Sudoku in Cryptol; other than the very fact that it’d be a shame if we couldn’t!Needless to say, Cryptol has not been designed for encoding search algorithms. Nonetheless, some of the features of Cryptol and its associated toolset make it extremely suitable for expressing certain constraint satisfaction problems very concisely; and Sudoku very nicely falls into this category.

Representing the board

A Sudoku board can be represented in a variety of ways. We will pick the simplest: A sequence of 9 rows, each of which has 9 elements storing the digits. Each digit will require 4 bits; since they range from 1 to 9. So, a good Cryptol type for a board is:

  [9][9][4]

In Cryptol-speak, this type simply represents a sequence of precisely 9 elements, each of which is a sequence of 9 elements themselves, each of which are 4-bit words. (Technically, the type [4] also represents a sequence of precisely 4 elements, each of which are bits. But it’s easier to read that as 4-bit words. The type [4] and [4]Bit are synonymous in Cryptol, and can be used interchangeably in all contexts.)

Read More

Tech Talk: Fun with Dependent Types

The March 24th Galois Tech Talk was delivered by Aaron Tomb, titled “Fun with Dependent Types.”

  • Date: Tuesday, March 24, 2009
  • Time: 10:30am – 11:30am
  • Location: Galois, Inc.421 SW 6th Ave. Suite 300(3rd floor of the Commonwealth Building)Portland, OR 97204

Here are Aaron’s slides. Further material on this topic can be found on Kenn Knowles’s site.Abstract: A number of dependently-typed programming languages exist, but many either restrict expressiveness or require extensive user input to deal with the undecidability of type checking. Languages such as Cayenne, lambda-H, and Sage have instead used a “best-effort” attempt to deal with this undecidability by attempting to type check programs, but potentially failing to prove valid programs type-correct.One powerful (and undecidable) form of dependent typing is based on what are variously known as contract types, refinement types, or predicate subtypes. The lambda-H language uses refinement types alone, and Sage includes them as part of a “pure” type system that uses the same syntax to describe both terms and types.An interesting recent result (by one of my friends from Santa Cruz) shows that while type checking for refinement types is undecidable, a form of type inference is decidable. It has the interesting property that if the input program is well-typed, then it has the inferred type. However, the algorithm does not determine whether the input program is, in fact, well-typed. Because it only decides one part of the type inference problem, the authors refer to it as “type reconstruction” instead.I will talk about refinement types, existing techniques for checking them, and the basics of decidable refinement type reconstruction.


Galois has been holding weekly technical seminars for several years on topics from functional programming, formal methods, compiler and language design, to cryptography, and operating system construction, with talks by many figures from the programming language and formal methods communities. The talks are open and free. An RSVP is not required, but feel free to contact the organizer with questions and comments.

Read More

Call for Proposals: CUFP 2009

Commercial Users of Functional Programming Workshop 2009:Functional Programming as a Means, Not an EndSponsored by SIGPLANCo-located with ICFP 2009Edinburgh, Scotland, 4 September 2009Galois is excited to promote this sixth annual event and encourages any interested in speaking at the workshop to send in a presentation proposal!  Whether you’d like to offer a talk yourself or you’d have someone in mind you’d like to nominate, please submit a proposal by 15 May 2009 via e-mail to francesco(at)erlang-consulting(dot)com or jim(dot)d(dot)grundy(at)intel(dot)com. Include a short description (approx. one page) of what you’d like to talk about or what you think your nominee should give a talk about.Do I have a presentation idea?If you use functional programming as a means rather than as an end (or could nominate someone who does), we invite you to offer to give a talk at the workshop. Talks are typically 25 minutes long but can be shorter and aim to inform participants about how functional programming plays out in real-world applications. Your talk does not need to be highly technical, and you do not need to submit a paper!What is the goal?The goal of the CUFP workshop is to act as a voice for users of functional programming and to support the increasing viability of functional programming in the commercial, governmental, and open-source space. The workshop is also designed to enable the formation and reinforcement of relationships that further the commercial use of functional programming.Tell me more!CUFP 2009 will last a full day and feature a keynote presentation from Bryan O’Sullivan, co-author of Real World Haskell. The program will also include a mix of presentations and discussion sessions varying over a wide range of topics.This will be the sixth CUFP;  for more information, including reports from attendees of previous events and video of recent talks, see the workshop web site: http://cufp.galois.com/.

Read More

Galois Tech Talk: Specializing Generators for High-Performance Monte-Carlo Simulation in Haskell

The March 10th Galois Tech Talks was delivered by Don Stewart on “Specialising Generators for High-Performance Monte-Carlo Simulation in Haskell.”Here are the slides.

  • Date: Tuesday, March 10, 2009
  • Time: 10:30am – 11:30am
  • Location: Galois, Inc.421 SW 6th Ave. Suite 300(3rd floor of the Commonwealth Building)Portland, OR 97204

Abstract: We address the tension between software generality and performance in the domain of simulations based on Monte-Carlo methods. We simultaneously achieve generality and high performance by a novel development methodology and software architecture centred around the concept of a specialising simulator generator. Our approach combines and extends methods from functional programming, generative programming, partial evaluation, and runtime code generation. We also show how to generate parallelised simulators.We evaluated our approach by implementing a simulator for advanced forms of polymerisation kinetics. We achieved unprecedented performance, making Monte-Carlo methods practically useful in an area that was previously dominated by deterministic PDE solvers. This is of high practical relevance, as Monte-Carlo simulations can provide detailed microscopic information that cannot be obtained with deterministic solvers.


Galois has been holding weekly technical seminars for several years on topics from functional programming, formal methods, compiler and language design, to cryptography, and operating system construction, with talks by many figures from the programming language and formal methods communities. The talks are open and free. An RSVP is not required, but feel free to contact the organizer with questions and comments.

Read More

Trustworthy Voting Systems

Accurate and reliable elections are a critical component of an effective democracy. However, completely secure and trustworthy voting procedures are difficult to design, and no perfect solutions are known. Ideally, a trustworthy voting system should guarantee both verifiability (the ability to prove that the counted vote matches the submitted ballots) and privacy (the inability to link the contents of a vote with the voter who cast it).These guarantees may now be achievable. Many researchers have proposed voting protocols that achieve verifiability and privacy in theory, and a few do so under assumptions that are satisfied by current election practices. Most of the protocols involve posting an encrypted version of the contents of every ballot in some public place (likely a web site), and depend on the properties of cryptographic operations to achieve privacy while allowing anyone to verify the final tally. Now that practical, secure voting protocols exist, the time has come to bring them into use. One existing solution that comes close to achieving these goals while retaining compatibility with current voting practices is the Scantegrity II system. It has the advantage that it can operate under current US election conditions, without requiring any modification to existing optical ballot scanners, and with very little change to the individual voting process. However, the software used in this system is only a prototype, with a number of shortcomings. Voter privacy depends on ability of a computer system to keep a key database completely secret, and accurate vote counting depends on the correct implementation of complex cryptographic algorithms. The software is tens of thousands of lines of code, and as with any other software of that size, many bugs certainly exist. We believe that the importance of trustworthy election results and the past lack of success in creating reliable solutions warrants a new approach to the design of voting systems. In particular, we advocate a class of techniques known as formal methods that allow us to make precise mathematical assertions about how software should behave, and determine whether it satisfies those assertions. Government agencies within the Department of Defense make use of formal methods to ensure the reliability of important computer systems, and the draft update to the development standards used by the Federal Aviation Administration, DO178C, includes provisions for the use of formal methods. Voting systems deserve similar care.

Read More

Equivalence and Safety Checking in Cryptol

The Cryptol language comes with an integrated verification tool-set that can automatically perform equivalence and safety checking on Cryptol programs. Recently, we have presented a paper on this topic at PLPV’09: “Programming Languages Meets Program Verification” workshop. (Slides are also available.)

Briefly, equivalence checking refers to the problem of proving that two functions have the exact same input/output behavior. Typically, these functions are versions of the same algorithm; one being a reference implementation and the other being an optimized version. Cryptol automatically establishes that the optimized version is precisely equivalent to the original. If the functions are not equivalent, Cryptol provides a counter-example where they disagree; aiding greatly in development/debugging.

Safety checking refers to the problem of proving that the execution of a function cannot raise any exceptions; such as division by zero; index out-of-bounds, etc. When the safety checker says that a function is safe, you will know for sure that such conditions will never arise at run-time. (Similarly, you will get a concrete counter-example from Cryptol if this is not the case.)

Cryptol uses symbolic simulation to translate equivalence and safety checking problems to equivalent problems using the bit-vector logic of SMT-Lib. Furthermore, Cryptol has built-in connections to several SAT/SMT solvers. It automatically calls these provers and presents the results to the user in original Cryptol terms; providing a seamless verification environment for the end-user.

The full paper and slides on equivalence checking in Cryptol are available for download.

Read More

A Cryptol Implementation of Skein

Following on from the MD6-in-Cryptol posting, let’s consider another very interesting candidate from the (deep) pool of SHA-3 submissions; Skein 

 http://www.skein-hash.info/ http://www.schneier.com/skein.html

by the merry band of Ferguson, Lucks, Schneier, et al.The expression of their reference implementation comes out, we think, fairly cleanly in Cryptol. The digest output size is a variable parameter to the algorithm, but we’ll focus on the 512-bit version here — the submission’s primary candidate for SHA-3.In order to avoid duplicating the introductory material on Cryptol, we suggest the reader go through the MD6 writeup to get a grounding in Cryptol, its idioms, and syntax.

Read More