

F# Language Details (Gotchas)

The ‘F’ in F# stands for fun. However, there are some details in F# that might lead to bugs, surprises, and/or un-fun. This post is to highlight a couple of random ‘gotchas’ when exploring some corners of the F# language.


Overriding Equals and Not Equals

F# allows you to overload operators so you can better map mathematical concepts to your code. For example, consider this simple vector type. Notice how it defines an equality operator (==) that compares just the vectors’ magnitudes, not their individual components.

type Vector =
    | Vector of float * float * float

    member this.Length =
        // Extract values via ninja pattern match
        // hidden in a let binding
        let (Vector(x,y,z)) = this
        sqrt <| x * x + y * y + z * z

    static member (+) (lhs, rhs) =
        let (Vector(x1,y1,z1)) = lhs
        let (Vector(x2,y2,z2)) = rhs

        Vector(x1 + x2, y1 + y2, z1 + z2)
        
    // Equals
    static member (==) (lhs : Vector, rhs : Vector) =
        lhs.Length = rhs.Length

    // Not equals
    static member (!=) (lhs : Vector, rhs) =
        not (lhs == rhs)

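When you run this code in FSI, it doesn’t behave as intended: comparing the two vectors with F#’s equality operator (=) ignores the custom definition.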

> // Bug?
let vec1 = Vector(10.0, 0.0, 0.0)
let vec2 = Vector(0.0, 0.0, 10.0);;

val vec1 : Vector = Vector (10.0,0.0,0.0)
val vec2 : Vector = Vector (0.0,0.0,10.0)

> vec1 = vec2;;
val it : bool = false
> vec1.Length = vec2.Length;;
val it : bool = true

The key here is understanding a bit more about how operator overloading works in F# and .NET. When you overload an operator such as (+), the compiler generates a method named op_Addition. So when a .NET compiler sees “x + y” it looks for a method with that name. Similarly, it looks for op_Subtraction, op_BitwiseOr, etc.
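One way to see these generated names is to list a type's methods via reflection. The sketch below is only an illustration: the Point record and the operatorMethods value are hypothetical names, not part of the Vector example.

```fsharp
// Hypothetical record type with two overloaded operators.
type Point =
    { X : float; Y : float }
    static member (+) (a : Point, b : Point) = { X = a.X + b.X; Y = a.Y + b.Y }
    static member (-) (a : Point, b : Point) = { X = a.X - b.X; Y = a.Y - b.Y }

// Collect the names of all public methods that start with "op_".
let operatorMethods =
    typeof<Point>.GetMethods()
    |> Array.map (fun m -> m.Name)
    |> Array.filter (fun name -> name.StartsWith("op_"))
    |> Array.sort

// Print whatever operator methods the compiler generated.
operatorMethods |> Array.iter (printfn "%s")
```

The list should include op_Addition and op_Subtraction, the names other .NET languages look up when they see + and - applied to this type.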

Since F# allows you to define symbolic operators (functions with symbols for names) you can overload operators that aren’t valid in C# or VB.NET, for example (==>):

// Rotate the vector about the Z axis (x and y change, z is unchanged)
static member (==>) (vec : Vector, rads : float) =
    let (Vector(x,y,z)) = vec
    
    let x' = x * cos rads - y * sin rads
    let y' = x * sin rads + y * cos rads
    let z' = z

    Vector(x', y', z)

> let v = Vector(10.0, 0.0, 0.0);;

val v : Vector = Vector (10.0,0.0,0.0)

> v ==> System.Math.PI;;
val it : Vector = Vector (-10.0,0.0,0.0)
> v ==> (System.Math.PI / 2.0);;
val it : Vector = Vector (0.0,10.0,0.0)

While F# recognizes these symbolic operators, to call them from C# you need to use the ‘long name’ which in the previous example would be op_EqualsEqualsGreater.

Overloading the equals operator in C# requires that you define (==), which in turn generates a method called op_Equality. However, defining (==) in F# generates a method called op_EqualsEquals.
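The naming mismatch can be checked with reflection. In this sketch the Meters class and the hasMethod helper are hypothetical, introduced only to show which method name an F# (==) definition produces.

```fsharp
// Hypothetical class that defines (==) the way a C# programmer might expect to.
type Meters(value : float) =
    member this.Value = value
    static member (==) (a : Meters, b : Meters) = a.Value = b.Value

// True when the Meters type exposes a public method with the given name.
let hasMethod (name : string) =
    not (isNull (typeof<Meters>.GetMethod(name)))

printfn "op_EqualsEquals defined: %b" (hasMethod "op_EqualsEquals")
printfn "op_Equality defined:     %b" (hasMethod "op_Equality")
```

Only op_EqualsEquals should be reported as present, which is why a C# caller writing a == b would not pick up this definition.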

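To emit op_Equality and op_Inequality, the methods other .NET languages look for, define the F# operators for equality, (=) and (<>), instead. Note that this mainly matters for callers in other .NET languages: F#’s own (=) operator goes through Object.Equals, so within F# you should prefer overriding Equals, and the compiler emits a warning when (=) is defined as a member.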

// Equals (compiles to op_Equality)
static member (=) (lhs : Vector, rhs : Vector) =
    lhs.Length = rhs.Length

// Not equals (compiles to op_Inequality)
static member (<>) (lhs : Vector, rhs : Vector) =
    lhs.Length <> rhs.Length
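For comparisons made from F# itself, a common approach is to override Equals on the type rather than define a static operator. The following is a minimal sketch of that approach, assuming the same length-based notion of equality; the CustomEquality and NoComparison attributes and the GetHashCode override are additions that the original Vector type does not have.

```fsharp
// Sketch: length-based equality via Equals, so that F#'s (=) picks it up.
[<CustomEquality; NoComparison>]
type Vector =
    | Vector of float * float * float

    member this.Length =
        let (Vector(x, y, z)) = this
        sqrt (x * x + y * y + z * z)

    // Two vectors are considered equal when their lengths match.
    override this.Equals(other : obj) =
        match other with
        | :? Vector as v -> this.Length = v.Length
        | _ -> false

    // Keep the hash code consistent with Equals.
    override this.GetHashCode() = hash this.Length

let vec1 = Vector(10.0, 0.0, 0.0)
let vec2 = Vector(0.0, 0.0, 10.0)

printfn "vec1 = vec2: %b" (vec1 = vec2)
```

With this definition, vec1 = vec2 evaluates to true in FSI because both vectors have a length of 10.0.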

Improper use of Range Expressions

Consider the following two loops:

for i in [1 .. 10000000] do
    ...

for i in 1 .. 10000000 do
    ...

They may not look that different, but they have dramatically different performance profiles. You can find these differences detailed in the recently updated F# Language Specification. Both are examples of range expressions, which are a way to create a sequence between two indexes. However, the subtlety here is that the first builds a full list before the loop starts, while the second is just the raw sequence.

So while the first loop requires ten million memory allocations to build the list, the second doesn’t. More importantly, the compiler can optimize the second loop into a standard for loop over a single integer, with no memory allocation at all.
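As a small illustration, the two loop forms below compute the same sum and differ only in whether a list is built first. The names n, sumFromList, and sumFromRange are hypothetical, and n is kept small so the sketch runs quickly.

```fsharp
let n = 1000

// The brackets build a list of n elements up front; the loop then walks that list.
let mutable sumFromList = 0
for i in [1 .. n] do
    sumFromList <- sumFromList + i

// The bare range needs no intermediate list; the loop just counts from 1 to n.
let mutable sumFromRange = 0
for i in 1 .. n do
    sumFromRange <- sumFromRange + i

printfn "%d %d" sumFromList sumFromRange
```

Both loops produce the same result, so dropping the brackets is usually a safe change when the list itself is not needed afterwards.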

Comments

  • Anonymous
    November 09, 2009
    Here's a compile-time gotcha: let addLengths (strings:seq<string>) = Seq.sumBy (fun s -> s.Length) strings;; doesn't compile, but after rearranging it does: let addLengths (strings:seq<string>) = strings |> Seq.sumBy (fun s -> s.Length);; This undoubtedly conforms completely to the spec, but as it seems so illogical, I fall for it regularly.

  • Anonymous
    November 09, 2009
    I find your equality example a bit confusing... you're just talking about how to define an override that works from C#, right?  As I understand it, the way to customize the (=) operator's behavior in F# is to override obj.Equals rather than to define a static operator, so your updated op_Equality definition will still not result in (vec1 = vec2) evaluating to true from within FSI.

  • Anonymous
    November 10, 2009
    Brilsmurf, I agree 100%. That is an easy thing to get burned by. It's even worse because the Visual Studio language service knows the 'type' of the object if you hover your mouse over it but you still get a compiler error. As for Keith's comment: the F# (=) operator calls the .Equals method you can override from System.Object OR calls the op_Equality operator should it be defined. You should prefer to override .Equals instead of overriding the (=) operator. In fact, the F# compiler emits a warning to this tune. My intent was to point out that in the case you did want to override the (=) operator you shouldn't override the (==) operator like in C# but instead the (=) operator.   Does that clear things up?

  • Anonymous
    November 10, 2009
    now I'm confused, so, in F# you use "==" to compare, but if you want to override it you override =? as for the non equality, you use "not (a == b)" to compare, but override it with <>?

  • Anonymous
    November 10, 2009
    Brilsmurf: Just keep repeating "top to bottom, left to right" and soon you won't be surprised :). Yea, it was a slightly misleading example - why would you override == and != -- they aren't F# operators.

  • Anonymous
    November 16, 2009
    Chris- You might also want to make note of the sequence expression syntax instead of just the for .. in .. do syntax. E.g. let valuesAsSeq = {0..100} vs. let valuesAsList = [0..100] Most of the F# samples that I see use the latter, which may or may not be the best one. I'm not sure how it impacts compiler optimizations, though. -Chris