Finding the First

I often need to write programs that follow a simple pattern:

  1. For each of a list of objects (e.g. IP addresses, filesystems), try operation 1
  2. On the first one that succeeds, perform operation 2
  3. If none succeed, perform operation 3

There’s probably actually a name for this kind of thing, it seems like it would be common enough to warrant one. Let’s see how we can do this in OCaml. First set up some initial conditions. The function success is operation 1, square is operation 2 and default is operation 3 in the preceeding list.

exception Result_found of int
let success x = match x with |5 -> true |_ -> false in
let square x = x * x in
let mylist = [1; 2; 3; 4; 5; 6; 7; 8; 9] in
let default = 10 in

First the most naive solution using a mutable variable. A variable is set to the default result, then all the items on the list are processed, if one succeeds then the variable is set to that, finally return whatever variable is set to at the end. This is a common method in shell script.

let d = ref default in
List.iter (fun x -> if success x then d := square x else ()) mylist;
print_endline (string_of_int !d);

The problems with this are obvious; firstly that it processes every list item regardless of whether it has already succeeded, and these may be time consuming or resource-intensive operations. Secondly it will return the last one, not the first. It is effectively just a foreach loop despite being expressed in the OCaml-ish List.iter syntax. So let us try something a bit better:

print_endline (string_of_int 
		 (try
		   square (List.hd (List.filter success mylist))
		 with
		   |Failure x -> default));

This does at least get the first one that succeeds with operation 1, however it still performs the test on each item. Let us see if we can combine these two approaches to return the first successful item without wasting any computation:

print_endline (string_of_int
		 (try
		    List.iter (fun x -> if success x then raise (Result_found x) else ()) mylist;
		    default
		  with
		    |Result_found x -> square x));

This works by breaking out of the loop by raising an exception (there is no native break/continue in OCaml tho’ it can be done by means of syntax extension). I don’t really like this approach; it is perfectly valid, and was considered the clever solution when I did Java way back when, but in my mind exceptions are a mechanism for dealing with conditions that are expected to happen only rarely and/or represent something going awry. And leaving the default dangling there is messy.

Finally, I believe this is the idiomatic solution, using a recursive function to work through the list, performing operation 3 if the list is empty, and passing the untried list members back to itself after an unsuccessful operation 1. If operation 1 succeeds, operation 2 is performed and that is the result of the function.

(* default, sucess condition, function, list *)
let rec try_each d c f xs =
  match xs with
    | []     -> d (* reached the end without success, return default *)
    | x::xs  -> match (c x) with
        | true  -> f x 
        | false -> try_each d c f xs in

print_endline (string_of_int (try_each default success square mylist))

Unlike the others this also abstracts out the three operations, making it fully reusable. It looks tantalizingly close to a fold (and hence a one-liner), but I can’t quite make the leap from here to there…

About Gaius

Jus' a good ol' boy, never meanin' no harm
This entry was posted in Ocaml. Bookmark the permalink.

10 Responses to Finding the First

  1. Daniel says:

    In Haskell that would be: try_each d c f xs = maybe d f (find c xs)

    To find the first element satisfying c use find :: (a -> Bool) -> [a] -> Maybe a if there is no such element you’ll get Nothing.
    Then maybe :: b -> (a -> b) -> Maybe a -> b handles the Maybe value and you are done.

    • Gaius says:

      There’s an option type in OCaml with Some a and None but I don’t think it’s the same thing as the Maybe or Either monads. Hmm.

      • gasche says:

        yes, “option” is the same as Haskell’s Maybe. It’s a monad as well, but the monadic operations are not in the standard library.

  2. Exceptions in OCaml are a very common way to program. The fact is that exceptiosn are significantly faster than in Java or C++ (it is almost like a “goto” in asm). So it is 1) not time consuming and 2) not a bad coding style in OCaml.

    May I suggest you this very idiomatic solution:

    let res =
    try
    square (List.find success mylist)
    with Not_found ->
    default

    The only advice concerning exception is to avoid adding a "try ... with ..." construct inside a tail recursive function (because it breaks tail recursion).

    • gasche says:

      I was going to suggest the use of List.find as well. Too slow !

      Note that the current solution is not quite right in case the post-processing steps (here square) may raise an exception : you would return default, while you probably rather want to raise that exception.

      The problem is that the “try” scope extends to the latter “in ..” part of the let-declaration, why we want it only in the “let = …” part. This has been discussed and a “let try” feature has been proposed by Nick Benton and Andrew Kennedy ( http://lambda-the-ultimate.org/node/1193 ).

      The usual OCaml workaround is to wrap the exception-raising “let = ..” in an option type:
      match
      (try Some (List.find success mylist)
      with Not_found -> None)
      with
      | None -> default
      | Some x -> square x

      Additionnally, Batteries proposes “Exceptionless” variants of the usual functions, that already return an option instead of raising an exception ( http://thelema.github.com/batteries-included/hdoc/BatList.Exceptionless.html ):
      match BatList.Exceptionless.find success mylist with
      | None -> default
      | Some x -> square x

    • Gaius says:

      Ah, very nice! If that is considered good style then I’ll adopt it.

      Back in the day in Java (we’re talking the 90s here) a trick for good performance was to rely on the VM checking everything you did anyway, e.g. when iterating over an array there was no need for a terminating condition comparing the current index to the length of the array in each iteration; just run off the end and let the VM raise ArrayIndexOutOfBoundsException and catch it and do nothing. I could see the benefits, but I never liked it, coming from C and FORTRAN before that. Java had nothing for doing this elegantly.

      Some versions of FORTRAN required knowing the size of all your arrays at compile time (!).

  3. bltxd says:

    Is there a reason you avoided the following ?


    let try_each d c f xs = try f (List.find c xs) with Not_found -> d

    • Gaius says:

      No good reason, just that I was thinking through the different approaches out loud. As m’sieur Le Gall says using exceptions this way is idiomatic, that or the option method is what I shall do in future! šŸ™‚

  4. Pingback: LDAP (3) | So I decided to take my work back underground

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s