LINQ does Logic? Chapter 2: Conjunctions

In the prior chapter in this series, we investigated sample 3 of Feraudy’s SProlog (resurrected) and recapitulated the sample in LINQ and in an updated version of Maeder’s Prolog-in-Mathematica (let’s call this MProlog from now on).

This time, we:

  1. Add an additional condition to the rule “can-float-on-water” to exclude materials that are water soluble (!) and write it in all three implementations.
  2. Create a new Mathematica version that emulates LINQ more closely, so that we have a pathway, in Mathematica, from full Prolog to something more LINQ-like. The purpose here is to give us an intellectual bridge to the C# world, LINQ’s homeland.

In a later post, we’ll discuss conjunctions in general, because adding a condition to the rule highlights that the rule is, logically, a conjunction — a sequence of terms connected by AND. Turns out this is foundational to Prolog and something we need to keep in mind while building up our LINQ capability.

The result of (consult "SPR/EXAMPL3.SPR") and then (findall Thing (can_float_on_water Thing) List) is the list

List = (ethanol olive_oil petroleum almond_wood apricot_wood apple_wood walnut_wood maple poplar balsa fat)

This looks ok except for the ethanol. Yes, it’s specific gravity is 0.789, less than 1.0, so it does satisfy (rless Density 1.0) just as we wrote. But ethanol is water soluble, and I think no matter how carefully you pour it on top of water, in the manner of a pousse-café, it’s not possible to float a layer of ethanol on water due to polar attraction at the molecular level.

So we didn’t write the program wrong, we wrote the wrong program.

Let’s fix this. First, add a few facts to the fact base:

(water_soluble water)
(water_soluble sea_water)
(water_soluble sugar)
(water_soluble ethanol)
(water_soluble milk)

and a new rule

((can_float_on_water Thing )/* head */
    /* if the following are true */
    (density Thing Density)
    (rless Density 1.0)
    (not (water_soluble Thing))

This time, when we run it, we get the List minus ethanol (in a new SProlog session — SProlog can’t handle consulting the file a second time)

List = (olive_oil petroleum almond_wood apricot_wood apple_wood walnut_wood maple poplar balsa fat)

Changes to the Mathematica version are almost identical:

Assert[waterSoluble[water]];
Assert[waterSoluble[seaWater]];
Assert[waterSoluble[sugar]];
Assert[waterSoluble[ethanol]];
Assert[waterSoluble[milk]];

Assert[canFloatOnWater[thing_],
    density[thing_, d_],
    d_ < 1.0,
    not[waterSoluble[thing_]]]

as are the results

{thing->oliveOil}
{thing->petroleum}
{thing->almondWood}
{thing->apricotWood}
{thing->appleWood}
{thing->walnutWood}
{thing->maple}
{thing->poplar}
{thing->balsa}
{thing->fat}

As with SProlog, we must restart the Mathematica session (Quit Kernel; Evaluate Notebook). This lends some evidence that both implementations, when reconsulting the facts and rules, are just adding new copies of the old information to the existing fact-and-rule base, and then they go a bit haywire when evaluating the queries. This is something to add to our list of things to look into, but, since we have an acceptable workaround, we won’t bother with it now.

What we will do is see if we can write the Mathematica version without all the MProlog machinery. This will look just like the LINQ one in C# and we’ll interleave the development of LINQ in Mathematica with the development of our query in LINQ in C#. So we’re going to go from SProlog to MProlog to MLINQ and C#-LINQ.

The first thing is to capture the fact base in a straight list-of-rewrite-rules. This requires us to choose permanent names for the attributes of the Prolog predicates. Prolog lets you create new variables to name the attributes every time you do a query. If you say

QueryAll[density[x_, y_]]

That means that, this time, you’d like to call the first slot of any density fact — the slot containing the name of a material — by the name x and the second slot of any density fact — the slot containing the value of the specific gravity — by the name y. If you say

QueryAll[density[foo_, bar_]]

it means that you’d like to call them foo and bar this time, in the context of this query.

In the LINQ-ish version, whether written in Mathematica or in C#, we don’t get to do this. In C# LINQ-ish, we’re mapping Prolog predicates to C# objects of classes and Prolog attributes to C# properties of objects, and we must pick the names of properties statically in C#. Likewise, in Mathematica LINQ-ish, we’re mapping Prolog facts to Mathematica lists of rewrite rules (which, as we shall see, are a great way to model objects in Mathematica). For the application of these rules, we will really need the attribute names — the patterns to match in the rewrite rules — to be statically known. If we need to rename the attributes for a different application — if we need to be more Prolog-ish in our use of C# or Mathematica, we can do it by defining new classes or lists of rules.

The point is: the need to pick attribute names more-or-less permanently is a real difference between Prolog-ish and LINQ-ish, but it doesn’t seem to be a practical limitation or problem. We’ll keep this in mind as we explore to see if there are deeper meanings or consequences as we go on.

Now, to harvest the factbase into rewrite rules in Mathematica, we will use its beautiful Sow and Reap primitives. We notice that Prolog’s interactive control paradigm — an initial Query followed by any number of Agains — is 100% imperative programming! This is ironic from Prolog, the “mother of all declarative programming languages,” but the resolution of the paradox is that Prolog is declarative in its fact-and-rules definitions, but definitely not — even in pretense — in its interaction model. Contrast this with Haskell, which must introduces monads to address interactivity in a declarative manner. There is much theoretical benefit to mine, there, but later. Back to harvesting the data in Mathematica — the give-away of imperative programming is the presence of For:

Reap[
  Module[{testVar},
   For[
    testVar = Query[density[material_, value_]],
    testVar =!= No,
    testVar = Again[],
    Sow[testVar]]]] [[2, 1]] // gridRules

producing:

using the gridRules utility from here.

We see that each density is a list of rewrite rules, one rule for rewriting the symbolic constant material and another rule for rewriting the symbolic constant value. For instance, one of the facts is {material -> seaWater, value -> 1.03}. To fetch the material out of this rule, write material /. theRule. That’s the general scheme: to use the “values” of the “properties” in a rewrite rule, just apply the rule to a symbolic expression. It should be obvious that this is equivalent to property access from an object, where we would write theObject.material to fetch the value of the material property, only more general. With a C# object, only a property name can go to the right of the dot. With a rewrite rule, an arbitrary expression can go to the left of the slashdot /. operator, (slashdot is the infix operator for ReplaceAll.

This harvesting code above looks almost exactly like what we would write in C#, bypassing the syntactic sugar of foreach:

var query = densities().GetEnumerator();
var reaper = new List();
for (var testVar = query.MoveNext();
    testVar;
    testVar = query.MoveNext())
    {
        reaper.Add(query.Current); // this is Mathematica's Sow!
    }
reaper.Dump();

producing:

Of course, in C# this is the same as calling .ToList() on the IEnumerable input, but in Mathematica, this teaches us that Prolog’s Query[...] is a combination of C#’s GetEnumerator and one call of C#’s MoveNext, and that Prolog’s Again[] is a combination of one call of MoveNext and one conditional call of Current. This realization suggests that we could rewrite SProlog in C# with LINQ very nicely, but that is another digression: we’re trying to use LINQ straight-up to solve the same kinds of problems that Prolog solves.

Let’s use Sow and Reap as above to pull all the density facts into a function named densities that returns a list, just as in C# we pulled them all into a function named densities that returns an IEnumerable<Density>; and likewise pull all the waterSoluble facts into a function named waterSolubles that returns a list. We lose C#/LINQ’s laziness at this point, but there are a number of ways for us to put it back, later. As already mentioned, though, we are giving up — very much on purpose and without a known need for a fix later — the freedom to name attributes arbitrarily.

Mathematica’s Select is LINQ’s Where. Let’s define the first of a namespace of LINQ-ish operators in Mathematica, then, as follows:

Linq`Where = Select

and now we can write our original query as

Linq`Where[densities, (value /. #) < 1.0 &]

This term:

(value /. #) < 1.0 &

is a lambda expression with positional parameters in Mathematica. The ampersand at the end tells us so, and the hash-mark in the middle is the sole parameter of the expression. If there were more, they would be denoted #1, #2, and so on. So this entire expression means “filter densities with the following lambda expression as Boolean predicate.” We could write it in postfix form as follows

densities // Where[#, <predicate-lambda-expression>]&

We’ve converted the Where call to another lambda expression and applied it via the Postfix call operation that, amusingly enough, is represented by the infix operator // (slashslash is the infix operator for Postfix!)

This is a lot of non-sugary syntax, but it’s actually quite close to what we would write in C#’s fluent syntax, which is exactly equivalent to what we wrote at the end of Chapter 1 of this series, only then in query-comprehension syntax:

results = densities()
    .Where(d => Convert.ToDouble(d.Value) < 1.0)
    ;

Staying with C#, now, since we’ve reoriented there, we may add our next condition in the form of checking that each material that passes the first filter is not a member of another list made of Selecting the material properties out of the water_solublescollection:

results = densities()
    .Where(d => Convert.ToDouble(d.Value) < 1.0)
    .Where(d => !water_solubles()
        .Select(w => w.Material)
        .Contains(d.Material))
    ;

This is not efficient, as it scans the water_solubles list for each material. It has avoidable quadratic complexity, and we’ll fix that later. At least it is a crystal-clear rendition of the original Prolog query, with lots of syntactic parallelism, and some syntactic gargoyles, which we will also fix. In comprehension syntax, this looks like the following:

results = from d in (
                   from d in densities()
                   where Convert.ToDouble(d.Value) < 1.0
                   select d)
               where !(from w in water_solubles()
                       select w.Material).Contains(d.Material)
               select d
               ;

In comprehension syntax, the nesting is unnatural. Let’s pull the first condition into the outer Where, connected with a && conjunction operator, so we have finally gotten around to justifying the title — Conjunctions — of this chapter:

results = from d in densities()
               where Convert.ToDouble(d.Value) < 1.0
                 && !(from w in water_solubles()
                      select w.Material).Contains(d.Material)
                  select d
                  ;

There is no difference in semantics or performance to the chained form in fluent syntax or the nested form in comprehension syntax.

To finish off this chapter, here are the equivalents in non-Prologuish Mathematica, introducing lambda expressions with named parameters via the \[Function] arrow, first in the nested form:

densities[] // 
  Linq`Where[#, d \[Function]
     (value /. d) < 1.0] & //
 Linq`Where[#, d \[Function]
    Not@((waterSolubles[] // 
         Linq`Select[#, x \[Function] material /. x] &) //
       Linq`Contains[#, material /. d] &)
   ] &

and in conjunctive form

densities[] // 
 Linq`Where[#, d \[Function] (value /. d) < 1.0 &&
     Not@((waterSolubles[] //
          Linq`Select[#, x \[Function] material /. x] &) //
        Linq`Contains[#, material /. d] &)] &

In those samples, we use the positional-parameter form of lambda expression — with hash marks and ampersands — only to support postfix notation, so that we may have as close a parallel to C#’s fluent syntax as we can, though no apologies for Mathematica‘s very crunchy operator syntax.

We’ve shown that we can easily migrate implicit conjunctive forms from Prolog to LINQ and that we can promote them to explicit conjunctive forms with the && operator. In the offing, we have exposed a number of important questions — and that’s what we’re supposed to be doing in this blog, namely

  1. how to restore laziness?
  2. is the need to pick attribute names permanently a real problem?
  3. whether and how to fix the syntactic gargoyles
  4. how to fix the avoidable quadratic complexity?
  5. what are the differences between explicit conjunctions (with &&) and implicit conjunctions (as in chained LINQ Wheres or just listing them in Prolog)

~ by rebcabin on August 31, 2012.

3 Responses to “LINQ does Logic? Chapter 2: Conjunctions”

  1. All code samples available at https://github.com/rebcabin/LINQdoesLogic

  2. […] In Chapter 2: Conjunctions, we built LINQ and Mathematica examples that check whether materials float on water by looking up densities and solubilities. The running times of these programs over the tiny sample data set were negligible, but we noted that the method used would not scale. That method used the .Contains extension method on IEnumerable and the MemberQ primitive on Lists to scan the entire list of solubles for each candidate material. […]

  3. […] represent the fact base the way we did in earlier chapters and scripts, only this time using anonymous types and C# dynamic, for reasons that become clear […]

Leave a reply to LINQ does Logic? Chapter 4: Composable Hashing « The Light Cone (Brian's space-time) Cancel reply