Beautiful Parallelism: Harnessing Multicores with Haskell

Don will be giving a talk at SC’08 in Austin, Texas on Monday 17th November, as part of the Bridging Multicore’s Programmability Gap workshop (see the schedule here), on programming mainstream multicore systems with Haskell, now. Here’s the abstract:

Haskell is a general purpose, purely functional programming language. If you want to program a parallel machine, a purely functional language such as Haskell is a good choice: purity ensures the language is by-default safe for parallel execution (whilst traditional imperative languages are by-default unsafe).

This foundation has enabled Haskell to become something of a melting pot for high level approaches to concurrent and parallel programming, all available with an industrial strength compiler and language toolchain, available now for mainstream multicore programming.

In this talk I will introduce the features Haskell provides for writing high level parallel and concurrent programs. In particular we’ll focus on lightweight semi-explicit parallelism using annotations to express parallelism opportunities. We’ll then describe mechanisms for explicitly parallel programs focusing on software transactional memory (STM) for shared memory communication. Finally, we’ll look at how Haskell’s nested data parallelism allows programmers to use rich data types in data parallel programs which are automatically transformed into flat data parallel versions for efficient execution on multicore processors.

See Simon Peyton-Jones and Satnam Singh’s recent tutorial for more background on multicore Haskell, on which this talk is based.
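To give a flavour of the semi-explicit parallelism mentioned in the abstract, here is a minimal sketch using the `par` and `pseq` annotations. The function names and the cutoff of 20 are illustrative assumptions, not taken from the talk; `par` and `pseq` are imported from GHC.Conc in base here, and the `parallel` package's Control.Parallel exports functions with the same names.

```haskell
-- Semi-explicit parallelism: `par` sparks its first argument for possible
-- parallel evaluation; `pseq` evaluates its first argument before
-- returning its second.
import GHC.Conc (par, pseq)

-- Naive Fibonacci, used only as a CPU-bound workload.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Parallel variant: spark one branch, evaluate the other, then combine.
-- Below the (arbitrarily chosen) cutoff we fall back to the sequential
-- version so that sparks are not created for tiny computations.
parFib :: Int -> Integer
parFib n
  | n < 20    = fib n
  | otherwise = x `par` (y `pseq` (x + y))
  where
    x = parFib (n - 1)
    y = parFib (n - 2)
```

The annotations only hint at opportunities for parallelism, so `parFib` returns the same result as `fib`. To actually use several cores, a program calling it would be compiled with GHC's `-threaded` flag and run with `+RTS -N`.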


Galois awarded NASA research contract

NASA has awarded Galois, Inc., together with the National Institute of Aerospace (NIA), a research contract to investigate monitor synthesis for software health management (here is NASA’s press release). The research team includes myself, Lee Pike, as the Principal Investigator, Cesar Munoz as the Co-PI (NIA), and Alwyn Goodloe as a Research Scientist (NIA). The award runs through the end of 2011, and we are investigating the formal synthesis of online monitors from requirements specifications. The research will focus on safety properties and real-time properties of distributed systems. Here are some slides I gave as part of an invited panel kicking off the project, and here’s the press release from Reuters. If you’re interested in finding out more about the research or in collaborating, don’t hesitate to contact me, or leave a comment!


Factor: an extensible interactive language

A special Galois tech talk next week, with Slava Pestov talking about the Factor programming language, environment, and implementation.

Logistics

  • Thursday Oct 30, 2008, 10.30 – 11.30

Abstract

Factor is a programming language which has been in development for a little over 5 years. Factor is influenced by Forth, Lisp, and Smalltalk. Factor takes the best ideas from Forth: simplicity, short, succinct code, emphasis on interactive testing, and meta-programming. Factor also brings modern high-level language features such as garbage collection, object orientation and functional programming familiar to users of languages such as Lisp, Smalltalk and Python. Finally, recognizing that no programming language is an island, Factor is portable, ships with a full-featured standard library, deploys stand-alone binaries, and interoperates with C and Objective-C.

In this talk, I will give the rationale for Factor’s creation, present an overview of the language, and show how Factor can be used to solve real-world problems with a minimum of fuss. At the same time, I will emphasize Factor’s extensible syntax, meta-programming and reflection capabilities, and show that these features, which are unheard of in the world of mainstream programming languages, make programs easier to write, more robust, and fun.

Galois has been holding weekly technical seminars for several years on topics from functional programming, formal methods, compiler and language design, to cryptography, and operating system construction, with talks by many figures from the programming language and formal methods communities.


Type Correct Changes: A Safe Approach to Version Control Implementation

Tuesday’s tech talk was a special treat, with Jason Dagit dropping by to talk about using GADTs to clean up the darcs patch theory implementation (.pdf slides).

Logistics

  • Tuesday, Oct 14, 2008, 10.30 – 11.30

Abstract

Darcs is based on a data model, known as Patch Theory, that sets it apart from other version control systems. The power of this data model is that it allows Darcs to manage significant complexity with a relatively straightforward user interface.

We show that Generalized Algebraic Data Types (GADTs) can be used to express several fundamental invariants and properties derived from Patch Theory. This gives our compiler, GHC, a way to statically enforce our adherence to the essential rules of our data model.

Finally, we examine how these techniques can improve the quality of the darcs codebase in practice.
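As a rough illustration of the idea in the abstract, the sketch below indexes each patch by the repository state it applies to and the state it produces, so that GHC rejects a sequence whose patches do not line up. All of the type and constructor names here are hypothetical and chosen for the example; they are not the types used in the darcs codebase.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical repository states, used only as type-level tags.
data Empty
data WithReadme
data WithLicense

-- A patch is indexed by the state it applies to and the state it produces.
data Patch from to where
  AddReadme  :: Patch Empty WithReadme
  AddLicense :: Patch WithReadme WithLicense

-- A sequence of patches: each patch must end in the state where the
-- next one starts, and the type checker enforces this.
data PatchSeq from to where
  Nil  :: PatchSeq s s
  (:>) :: Patch a b -> PatchSeq b c -> PatchSeq a c
infixr 5 :>

-- Well-typed: the intermediate states agree.
history :: PatchSeq Empty WithLicense
history = AddReadme :> AddLicense :> Nil

-- Ill-typed, so it is left commented out: AddLicense cannot be applied
-- to an Empty repository.
-- broken = AddLicense :> AddReadme :> Nil

-- Ordinary functions can still traverse a sequence.
lengthSeq :: PatchSeq a b -> Int
lengthSeq Nil       = 0
lengthSeq (_ :> ps) = 1 + lengthSeq ps
```

Uncommenting `broken` makes the module fail to compile, which is the point: an invariant that would otherwise be a runtime check or a comment becomes something the compiler verifies.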



Tech Talk: The Future of Cabal

Duncan Coutts, from Well-Typed, gave a tech talk on Tuesday, 7th Oct, 10.30am, at Galois, about the technical direction of Cabal, Haskell package infrastructure, and the problems of managing large volumes of Haskell code. (.pdf slides).

The Future of Cabal

Abstract

This will be an informal talk and discussion on two topics:

A language for build systems

Build systems are easy to start but hard to get right. We’ll take the view of a language designer and look at where our current tools fall down in terms of safety/correctness and expressiveness.

We’ll then consider some very early ideas about what a build system language should look like and what properties it should have. Currently this takes the form of a design for a build DSL embedded in Haskell.

Constraint solving problems in package deployment

We are all familiar, at least peripherally, with package systems. Every Linux distribution has a notion of packages and most have high level tools to automate the installation of packages and all their dependencies. What is not immediately obvious is that the problem of resolving a consistent set of dependencies is hard, indeed it is NP-complete. It is possible to encode 3-SAT or Sudoku as a query on a specially crafted package repository.

We will look at this problem in a bit more detail and ask if the right approach might be to apply our knowledge about constraint solving rather than the current ad-hoc solvers that most real systems use. My hope is to provoke a discussion about the problem.
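To make the dependency resolution problem concrete, here is a small backtracking resolver. It is only a sketch under simplifying assumptions (one installed version per package, dependencies given as explicit lists of acceptable versions); it is not how Cabal or any real package manager works, and every name in it, including the example repository, is made up for illustration.

```haskell
type Package = String
type Version = Int

-- A dependency: the named package must be present at one of these versions.
data Dep = Dep Package [Version]

-- A repository lists the dependencies of every available (package, version).
type Repo = [((Package, Version), [Dep])]

-- An install plan picks exactly one version for each package it mentions.
type Plan = [(Package, Version)]

-- Enumerate every plan that satisfies the outstanding dependencies,
-- backtracking whenever a choice conflicts with the plan so far.
resolve :: Repo -> Plan -> [Dep] -> [Plan]
resolve _ plan [] = [plan]
resolve repo plan (Dep pkg allowed : rest) =
  case lookup pkg plan of
    Just v
      | v `elem` allowed -> resolve repo plan rest  -- already chosen, compatible
      | otherwise        -> []                      -- conflict: abandon this branch
    Nothing ->
      [ result
      | ((p, v), deps) <- repo
      , p == pkg
      , v `elem` allowed
      , result <- resolve repo ((pkg, v) : plan) (deps ++ rest)
      ]

-- A hypothetical repository: "app" accepts either version of "json",
-- but the only acceptable "http" insists on version 2 of "json".
exampleRepo :: Repo
exampleRepo =
  [ (("app", 1),  [Dep "json" [1, 2], Dep "http" [2]])
  , (("http", 2), [Dep "json" [2]])
  , (("http", 1), [])
  , (("json", 1), [])
  , (("json", 2), [])
  ]
```

Evaluating `resolve exampleRepo [] [Dep "app" [1]]` first tries version 1 of "json", discovers the conflict only after choosing "http", and backtracks to version 2. That search is exponential in the worst case, which is what motivates asking whether proper constraint solving techniques would do better.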


Bluespec SystemVerilog

Rishiyur Nikhil, CTO of Bluespec, Inc., will be giving a tech talk on Thursday, October 2nd, at 10.30am, about Bluespec SystemVerilog.

Abstract

Over the past few years, several projects in major companies have been adopting BSV (Bluespec SystemVerilog) as their next-generation tool of choice for IP design, modeling (for both architecture exploration and early software development), and verification environments.

The reason for choosing BSV is its unique combination of:

  1. excellent computation model for expressing complex concurrency and communication, based on atomic transactions and atomic transactional inter-module methods
  2. very high level of abstraction and parameterization (principally inspired by Haskell)
  3. full synthesizability, enabling execution on FPGAs, obtaining better performance (3 to 4 orders of magnitude) and scalability than software simulation at comparable levels of detail.

In this presentation, I will provide a brief technical overview of BSV (points 1-3 above), and describe several customer projects using BSV. I will also briefly contrast BSV with other approaches to High Level Synthesis (particularly those based on C/C++/SystemC).


Update: Bike Commute Challenge

With only a week to go in the 2008 Bike Commute Challenge, it’s looking as if Galois will pass its 2007 results. Last year (PDF), 17.1% of our September commutes were by bicycle. This year, our commute-by-bike rate is 19.1%.

N.B. If last year’s statistics (PDF) hold true for this year, Galois employee Sigbjorn Finne will finish in the Top 10 Riders By Distance category, and most likely in the top five.

Friday, Sept. 26 update: Folks must have caught up on their riding logs, because the Galois commute rate has risen to 21.6%!


Galois @ ICFP: See you there!

ICFP is next week, and as usual, Galois will be involved, sponsoring workshops, chairing sessions, presenting papers, and generally talking to people about functional programming and the future. We’re particularly excited about the expanded Haskell Symposium, the line-up for the Commercial Users of Functional Programming, and the all-new DEFUN developer tracks on functional programming (watch Oleg hack live!).

If you want to catch up, keep an eye out for Andy, Don, Eric, Iavor, Joe, Joel, John, Levent, Magnus and Trevor, or follow us on Twitter. Happy hacking!


Parsing the Linux kernel with Haskell: experience with Language.C

At Galois, Aaron Tomb has recently been experimenting with the new Haskell Language.C library (a Summer of Code project by Benedikt Huber, mentored by a Galois engineer, Iavor Diatchki), and he’s been impressed by what it can do. Here are his thoughts on parsing the Linux kernel with Haskell, with an eye to future static analysis work:

My interest in the library is for use in static analysis of very large bodies of legacy C code, which means two issues matter a lot to me: 1) rock-solid support for all of GCC’s numerous extensions, and 2) speed. I have used CIL, and tools based on CIL, in the past, but have been disappointed with its speed.

As a simple scalability and robustness experiment, I decided to see how well Language.C would do on the Linux source tree. It doesn’t yet have an integrated preprocessor (depending on GCC’s for now), but I happened to have an already-preprocessed set of sources for Linux 2.6.24.3 sitting around (configured with defconfig).

Could Language.C handle the Linux kernel?

I wrote a little wrapper around the C parser to essentially just syntax-check all of the code.

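import Language.C
import Language.C.System.GCC
import System.Environment

-- Parse one preprocessed file and report the outcome.
process :: String -> IO ()
process file = do
  putStr file
  stream <- readInputStream file
  -- Pad the filename so the results line up in a column.
  putStr (take (20 - length file) $ repeat ' ')
  -- Print the parse error on failure, or "Pass" on success.
  either print
         (const $ putStrLn "Pass")
         (parseC stream nopos)

-- Syntax-check every file named on the command line.
main :: IO ()
main = do
  files <- getArgs
  mapM_ process files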

It prints the filename followed by “Pass” if the parse succeeds, or details about a syntax error if the parse fails. When I ran this on the Linux code mentioned above, I was amazed to find that it processed it all successfully! All 18 million lines of pre-processed source without a hitch.

Since I also care about speed, I wanted to compare it with GCC. GCC has a handy flag, -fsyntax-only, which tells it to just check the syntax of the input file and quit. I ran both the Language.C wrapper (compiled with GHC 6.8.3 and the -O2 option) and GCC on all that code, on a 2.2GHz/4GB MacBook Pro. The result: Language.C parsed all of the code in about 6 minutes, while GCC managed it in a little over 2. GCC is still faster, but I’m happy to take a 3x speed hit for the benefit of being able to write all the subsequent analysis in Haskell.

The following table shows the precise time and memory statistics for Language.C and GCC, both on the entire source collection and on the single largest file in the tree, bnx2.i, the driver for the Broadcom NetXtreme II network adapter. For the Language.C tests, I compared the performance when the garbage collector used 2 generations (the default) to 4 generations (specified with the +RTS -G4 option). Increasing the number of generations helped slightly.

Test               User Time   System Time   Memory Use
L.C, all           5:59        0:09          144MB
L.C, all, -G4      5:27        0:08          131MB
L.C, bnx2.i        0:02.15     0:01.89       133MB
L.C, bnx2.i, -G4   0:01.96     0:01.76       85MB
gcc, all           2:02        0:17          ????
gcc, bnx2.i        0:00.56     0:00.07       33MB
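
As a quick check on the rough 3x figure, the whole-tree timings in the table can be turned into a slowdown ratio. The short sketch below does that arithmetic; the helper names are made up for the example, and treating total CPU time as user time plus system time is an assumption about how to compare the two tools.

```haskell
-- Convert a (minutes, seconds) timing from the table into seconds.
toSeconds :: (Int, Double) -> Double
toSeconds (m, s) = fromIntegral m * 60 + s

-- Total CPU time, taken here as user time plus system time.
total :: (Int, Double) -> (Int, Double) -> Double
total user sys = toSeconds user + toSeconds sys

-- Whole-tree timings copied from the table.
languageCAll, languageCAllG4, gccAll :: Double
languageCAll   = total (5, 59) (0, 9)   -- 368 seconds
languageCAllG4 = total (5, 27) (0, 8)   -- 335 seconds
gccAll         = total (2, 2)  (0, 17)  -- 139 seconds

-- How many times slower a run is than GCC on the same input.
slowdown :: Double -> Double
slowdown t = t / gccAll
```

With these numbers, `slowdown languageCAll` is about 2.6 and `slowdown languageCAllG4` is about 2.4, so the 3x figure quoted in the text is, if anything, slightly pessimistic for the whole tree.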