Strongly Typed vs. Loosely Typed Rules Engines
In a lot of respects, rules engines remind me of programming languages. One place this is especially obvious is in how strongly typed the variables in the data universe are. For example, in some programming languages (and rules engines) you simply state that a variable exists without declaring its size or type. I saw this in some XML-based rules engines, and it reminds me of JavaScript or PHP. Other rules engines, like Experian’s PowerCurve, require that you state the type and size of each variable. This reminds me of the schema definition of a relational database, or of languages with declared types such as C# and C.
What are the pros and cons of these approaches and is there even a “best” way to do things?
Let’s talk about strongly typed rules engines first.
Pros
Anyone looking at the data dictionary can tell the size and type of a variable, so the dictionary doubles as a kind of documentation.
It allows some errors to be trapped at design time rather than at runtime, and an error caught at design time never reaches a live decision.
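To make the design-time point concrete, here is a minimal sketch in Python of what a typed data dictionary lets a rule editor check before anything runs. The names `FieldDef`, `Rule`, and `check_rule` are hypothetical and not taken from PowerCurve or any other product; the sketch only covers rules that compare a variable to a literal value.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical data dictionary entry: a type name plus an optional maximum size.
@dataclass(frozen=True)
class FieldDef:
    type_name: str                  # "text" or "number" in this sketch
    max_size: Optional[int] = None  # only meaningful for text


# Hypothetical rule: compare a variable to a literal value.
@dataclass(frozen=True)
class Rule:
    variable: str
    operator: str                   # e.g. "==", ">", "<"
    literal: object


def check_rule(rule, dictionary):
    """Return the design-time problems found in a rule, without running it."""
    field = dictionary.get(rule.variable)
    if field is None:
        return [f"{rule.variable}: not defined in the data dictionary"]

    problems = []
    if field.type_name == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(rule.literal, (int, float)) and not isinstance(rule.literal, bool)
        if not is_number:
            problems.append(f"{rule.variable}: compared to a non-numeric literal")
    elif field.type_name == "text":
        if not isinstance(rule.literal, str):
            problems.append(f"{rule.variable}: compared to a non-text literal")
        elif field.max_size is not None and len(rule.literal) > field.max_size:
            # A literal longer than the declared size can never match.
            problems.append(f"{rule.variable}: literal exceeds declared size {field.max_size}")
    return problems
```

With a dictionary that declares `state_code` as text of size 2, a rule comparing it to "CAL" is flagged while the rule is being written, which is exactly the kind of mistake a loosely typed engine would only reveal at runtime (if at all).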
Cons
I worked with one customer who returned all rule engine inputs via an API. He found it pretty annoying that we had to know the size of every single variable (especially some of the text strings).
You can’t overload a variable and treat it as two different types (or at least it’s harder to do). While I never came across the situation, I can imagine an API returning different object types for the same call. With a strongly typed rules engine I don’t have a simple way to handle that, especially without a pointer type or casting abilities.
Now for loosely typed rules engines.
Pros
You just state that a variable exists, or use it directly in a rule on the presumption that it came from the input JSON or XML.
You don’t have to predefine everything, and the engine is probably easier to maintain: the JSON or XML can change without you having to mirror every little change in a strongly typed data dictionary.
Cons
You have less ability to pick up errors at design time, so more of them surface at runtime.
You lose the documentation effect of having a data dictionary.
You probably need to build a mechanism to trap erroneous inputs (e.g. you expect a text code to be 10 characters and you get a gigabyte). Maybe the API layer can handle this using its own metadata, but it needs to happen somewhere.
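As a rough illustration of that last point, the sketch below shows one way a guard in front of a loosely typed engine might look in Python. The function name `guard_inputs`, the exception `InputRejected`, and both limits are assumptions made up for this example. It caps the raw payload before parsing, because a check that runs only after a gigabyte has been parsed comes too late.

```python
import json

# Assumed limits for this sketch; a real deployment would pick its own.
MAX_PAYLOAD_CHARS = 1_000_000
MAX_TEXT_LENGTH = 256


class InputRejected(ValueError):
    """Raised when an input payload fails the basic sanity checks."""


def guard_inputs(raw_json, max_payload_chars=MAX_PAYLOAD_CHARS,
                 max_text_length=MAX_TEXT_LENGTH):
    """Parse a JSON payload of rule inputs, rejecting oversized content."""
    # Check the raw size first so a huge payload is never parsed at all.
    # Note: this counts characters, not bytes.
    if len(raw_json) > max_payload_chars:
        raise InputRejected(f"payload of {len(raw_json)} characters exceeds limit")

    payload = json.loads(raw_json)
    if not isinstance(payload, dict):
        raise InputRejected("expected a JSON object of named variables")

    # Only top-level text values are checked here; nested values pass through.
    for name, value in payload.items():
        if isinstance(value, str) and len(value) > max_text_length:
            raise InputRejected(
                f"{name}: text of length {len(value)} exceeds limit {max_text_length}"
            )
    return payload
```

Note that the guard still knows nothing about what each variable means. It only bounds the damage a bad input can do, which is the minimum a loosely typed engine needs somewhere in the pipeline.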
So what’s the right answer: a strongly typed rules engine or a loosely typed one? What if we gave the user a choice? At a base level, I’d let the user write rules that access variables directly, without predefining them, so the engine works loosely typed by default. I’d then let the user additionally define the type and size of some variables (or all of them, if they choose), and that optional strong typing would help with design-time error detection. Maybe this would give the best of both worlds. Feel free to reach out to me to discuss; I think it’s an interesting topic.
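For anyone who wants something concrete to argue with, here is a small Python sketch of the opt-in idea. Everything in it is hypothetical: `lint_rule` and the shape of the `declared` mapping are invented for illustration, and it only handles rules that compare a variable to a literal. Variables with no declaration are left alone, while declared ones get a design-time check.

```python
def lint_rule(variable, literal, declared):
    """Classify a rule comparing `variable` to `literal` at design time.

    `declared` maps a variable name to a (type_name, max_size) pair and may
    cover only some of the variables the rules use.
    """
    if variable not in declared:
        # Loosely typed by default: nothing to verify until runtime.
        return "unchecked"

    type_name, max_size = declared[variable]
    if type_name == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        ok = isinstance(literal, (int, float)) and not isinstance(literal, bool)
    elif type_name == "text":
        ok = isinstance(literal, str) and (max_size is None or len(literal) <= max_size)
    else:
        # An unknown declared type is treated as an error in this sketch.
        ok = False
    return "ok" if ok else "error"
```

The three outcomes map onto the choice described above: "unchecked" is the loosely typed path, while "ok" and "error" only appear for variables the user chose to declare. A rule editor could show "unchecked" as a gentle hint rather than a failure, so declaring types stays optional.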