Migrating Higher Order Programming to the Web

My application, HOPE (Higher Order Programming Environment) is, in its simplest definition, a semantic publisher-subscriber application builder.  I’ve been wanting to move it to the web, but this is fraught with security concerns, hence my investigations into Docker technology and using a language like Python for the programming of “receptors” — the autonomous computational units that process semantic data.  Another stumbling block is my lack of experience with HTML5 canvas and graphics rendering.  Regardless though, there was no reason not to put together a proof of concept.

Here’s a quick walkthrough — no, this code is not yet publicly available.

Step 1: create a few receptors.

We’ll do some computations based on inputting a birth date.

Receptor #1: computing the age of a person in terms of years and days:

r1.png

Notice the class name computeAge.  We’ll talk about this later.

Receptor #2: compute the number of days to the birth day:

r2.png

Note the class name daysToBirthday.

Receptor #3: Get the interesting people born on the same month and day:

r3.png

Again, note the class name personsOfInterest.

Step 2: Add the Receptors to the Surface Membrane

r4.png

Step 3: Inject a semantic JSON object

Run the “membrane” and inject:

{"birthday": {"year": 1962, "month": 8, "day": 19}}

r5.png

And here’s the result:

r6.png

The full output being:

[
 {
 "age": {
 "years": 37,
 "days": 204
 }
 },
 {
 "daysToBirthday": 161
 },
 {
 "personsOfInterest": [
 "1724 Samuel Hood, 1st Viscount Hood, British admiral in the American Revolutionary War and the French Revolutionary Wars, born in Butleigh, England (d. 1816)",
 "1745 John Jay, American statesman, 1st US Chief Justice, born in New York City",
 "1863 Edvard Munch, Norwegian painter and print maker (The Scream), born in Ådalsbruk, Løten, Norway (d. 1944)",
 "1915 Frank Sinatra, American singer (Strangers in the Night, My Way) and actor (From Here to Eternity) known as 'old blue eyes', born in Hoboken, New Jersey (d. 1998)",
 "1932 robert pettit, American NBA star (St Louis Bombers/1959 MVP), born in Baton Rouge, Louisiana"
 ]
 }
]

What’s Going On?

Simply put, a preprocessor creates a mapping between semantic types and receptors (Python classes):

receptorMap = 
{
  'birthday':[computeAge(), personsOfInterest()],
  'age':[daysToBirthday()]
}

and the semantic processor engine routes the JSON semantic types to the Python class receptor’s process method.  When you inject “birthday”, it routes that type’s data to the computeAge and personOfInterest receptors.  The output of computeAge, which is of type “age”, is routed to the daysToBirthday receptor.

In this manner, one can create a library of small computational units (receptors) and build interesting “computational stories” by mixing and matching the desired computations.  Creating the receptors in Python makes this approach perfectly suited for running in Docker containers.  My ultimate vision is that people would start publishing interesting receptors in the open source community.

There’s still much more to go, but even as such, it’s a fun prototype to play with!  Some of the interesting problems that come out of this is, how do we let the end user create a visual interface (a UI, in other words) that facilitates both intuitive input of the semantic data as well as displaying real time output of the semantic computations.  Just the kind of challenging stuff I like!

FlowSharpCode, continued…

fsc2.png

A simple example, but the “problem” is that the three Drakon shapes (begin loop, output, and end loop) each still have individual C# code-behind in each shape.  For example, the begin loop has the code-behind:

 var n in Enumerable.Range(1, 10)

My original idea was that the Drakon shape description should not define the language-specific syntax, instead that should be implemented by the developer in the code-behind.

In practice (and I’ve written a complex application in FlowSharpCode, so I know) it becomes unwieldy to deal with one-liner code behind, the result of which is that I tend not to use Drakon shapes, but that results in nothing better than a meaningless box with some code in it.

I’m also reluctant to put the code in the shape label (though this is supported) as again we’re now dealing with language specific syntax.

I’m also reluctant to create a meta-language for Drakon shapes, for example, something that could interpret:

n = 1..10

into C#, Python, whatever.  What if the developer wants to write:

Count from 1 to 10

So, what I’m considering is letting the developer create the Domain Specific Language (DSL) so that they can expressively communicate the semantics of a Drakon shape and also provide the rules for how the semantics is parsed, ideally in an intermediate language (IL), for example, something that expresses a for loop, a method call, whatever.

The advantage to this is that the developer can create whatever DSL they like to work in, the IL glues it together into the concrete language.

Two things happen then:

  1. The DSL is interchangeable.  Any IL can be super-composed into your DSL choice.
  2. The IL is language independent, so it can be de-composed into language specific syntax.

Item #2 of course imposes some significant limitations — what if a language doesn’t support classes, or interfaces, or yield operator, or whatever?  I’m not particularly too concerned about that as a language-independent DSL/IL is more of a curiosity piece, as it becomes rapidly untenable when your code starts calling language-framework-platform dependencies.

However, I’d love to hear my readers thoughts on this DSL/IL concept I’m considering.

 

 

Why Semantic Databases are so Important – Discovering Shared Meaning

An excerpt from my upcoming article that I wanted to share separately from the full article, as it is an excellent example of one reason semantic databases are so important — the ability to discover shared meaning.

Being able to discover implicit associations is one of the unique features of a semantic database.  Given three semantic types:

  1. Person
  2. Date
  3. MonthLookup

implemented with the following semantic schema:

threefoldSchema.png
Our “discovery” method determines the following associations (left-to-right is bottom-most to top-most):

month <- monthLookup
month <- date
  
name <- monthName <- monthLookup
name <- firstName <- personName <- person
  
name <- monthName <- monthLookup
name <- lastName <- personName <- person
  
name <- monthAbbr <- monthLookup
name <- firstName <- personName <- person
  
name <- monthAbbr <- monthLookup
name <- lastName <- personName <- person

From this, we could construct a query where we can say “give me all the dates whose month’s name is also the person’s first name”.  This would follow the association chain like this:

associations1

 

 

 

If we populate the month lookup semantic records with the obvious 12 months of the year, and the person an date semantic records with (flattened view here):

{month: 8, day: 19, year: 1962}
{month: 4, day: 1, year: 2016}

and (again, flattened view here):

{firstName: 'Marc', lastName: 'Clifton'}
{firstName: 'April', lastName: 'Jones'}

We can then write a MongoDB query, based on the discovered shared meaning:

db.firstName.aggregate(
// firstName -> name
{ $lookup: {from: ‘name’, localField: ‘nameId’, foreignField: ‘_id’, as: ‘fname’} },
{ $unwind: ‘$fname’ },
// name -> monthName -> monthLookup
{ $lookup: {from: ‘monthName’, localField: ‘fname._id’, foreignField: ‘nameId’, as: ‘monthName’} },
{ $unwind: ‘$monthName’ },
{ $lookup: {from: ‘monthLookup’, localField: ‘monthName._id’, foreignField: ‘monthNameId’, as: ‘monthLookup’} },
{ $unwind: ‘$monthLookup’},
// monthLookup -> month
{ $lookup: {from: ‘month’, localField: ‘monthLookup.monthId’, foreignField: ‘_id’, as: ‘month’} },
{ $unwind: ‘$month’},
// month -> date
{ $lookup: {from: ‘date’, localField: ‘month._id’, foreignField: ‘monthId’, as: ‘date’} },
{ $unwind: ‘$date’},
// date.day -> day.value
{ $lookup: {from: ‘day’, localField: ‘date.dayId’, foreignField: ‘_id’, as: ‘day’} },
{ $unwind: ‘$day’},
// date.year -> year.value
{ $lookup: {from: ‘year’, localField: ‘date.yearId’, foreignField: ‘_id’, as: ‘year’} },
{ $unwind: ‘$year’},
{ $project: {‘monthName’: ‘$fname.name’, ‘month’: ‘$month.value’, ‘day’: ‘$day.value’, ‘year’: ‘$year.value’, ‘_id’:0} } )

Giving us the one matching record.

fnamemonthdate

 

 

 

 

What we’ve achieved here is quite interesting!  Because our database is semantic, the system knows that things like “month” and “name” have a shared meaning, so we can ask the schema “what entities have shared meaning” and we can weave through the hierarchies of the semantic types to combine types into new and interesting queries.  This kind of query could of course be expressed in SQL (and perhaps more simply), but we would be comparing non-semantic field values that the programmer decided had shared meaning, rather than the system discovering the shared meaning.  By letting the system discover the shared meaning, the user, not the programmer, can make new and interesting associations.

Using MongoDB to Implement a Semantic Database – Part I

mongodbart

 

 

Newly published article on Code Project.

Semantic Database Technology (from InformationWeek):

Semantic technology has created a disruptive opportunity for businesses to obtain more value from their data. The concepts surrounding the semantic Web, such as linked data cloud and data mashups, are powered by a set of emerging standards and products that, for now, are mainly used for consumer services. However, these technologies are equally compelling as part of an enterprise data platform behind the firewall.
At a high level, there are five main benefits of semantic technology:
    > It works in tandem with your existing database investments;
    > It aligns with Web technologies;
    > It speeds the integration of multiple databases;
    > It’s based on data structures that are flexible by design; and
    > It can help enterprises tackle big data challenges.

Excerpts from my article:

Fundamentally, a semantic database captures relationships.  There are two primary kinds of relationships:

  1. Static, implicit relationships that define the structure (give meaning) to a semantic term (a symbol).  These are typically expressed with the same terms used in object oriented programming “has a” and “is a kind of.”
  2. Static or dynamic explicit relationships, where the relationship itself has a meaning expressed in a semantic term and where dynamic relationships can change over time.  In programming, these relationships are usually expressed implicitly in the code, for example, a dictionary or other key-value pair collections.  Dynamic relationships often have a time frame — a beginning and an ending.

The advantage of a NoSQL database is that the schema itself is dynamic:

  1. The structure of implicit symbols change (think of how names and addresses vary among cultures.)
  2. New symbols can be easily added (simply add a new collection.)
  3. New relationships between symbols can be easily added (simply add a collection with two fields associating the ID’s of two collections.)

Read the whole article here!

MongoDB Grows Up – the $lookup aggregator

mongodbartSo far, I’ve been avoiding anything having to do with NoSQL because of the inability to do, in the classical RDBMS world, table joins.  The idea of having to pull into memory (well, my application’s memory) all the results of one table and then join them (with more code) to another table was appalling to me, particularly since I want my queries to be runtime generated by metadata schema.

However, as of the release of MongoDB 3.2, that has changed!  I can now do multiple table (collection) joins, which in my opinion, is critical for working with semantic data.  As Semag points out:

“Data is organized based on binary models of objects, usually in groups of three parts: two objects and their relationship.”

So now, in MongoDB, I can do something very simple (and very non-semantic in this example):

db.createCollection("Person")
db.createCollection("Phone")
db.createCollection("PersonPhone")

db.Person.insert({ID: 1, LastName: "Clifton", FirstName: "Marc"})
db.Person.insert({ID: 2, LastName: "Wagers", FirstName: "Kelli"})

db.Phone.insert({ID: 1, Number: "518-555-1212"})
db.Phone.insert({ID: 2, Number: "518-123-4567"})

db.PersonPhone.insert({ID: 1, PersonID: 1, PhoneID: 1})
db.PersonPhone.insert({ID: 2, PersonID: 2, PhoneID: 1})
db.PersonPhone.insert({ID: 3, PersonID: 2, PhoneID: 2})

Notice I have to tables, Person and Phone (please ignore my case style, I come from a different world, some may say planet).  I can now query the data using the $lookup aggregate function (plus a few other pieces):

db.PersonPhone.aggregate([
{ $lookup: { from: "Person", localField: "PersonID", foreignField: "ID", as: "PersonName" } }, 
{ $lookup: { from: "Phone", localField: "PhoneID", foreignField: "ID", as: "PersonPhone" } }, 
{ $match: {PersonID: 2} }, 
{$project: {"PersonName.LastName":1, "PersonName.FirstName":1, "PersonPhone.Number": 1, _id:0}} ])

and I get a lovely resulting dataset:

{
  "PersonName": [
    {
      "LastName": "Wagers",
      "FirstName": "Kelli"
    }
  ],
  "PersonPhone": [
    {
      "Number": "518-555-1212"
    }
  ]
}{
  "PersonName": [
    {
      "LastName": "Wagers",
      "FirstName": "Kelli"
    }
  ],
  "PersonPhone": [
    {
      "Number": "518-123-4567"
    }
  ]
}

So now I can use finally MongoDB to satisfy the foundational tenet of a semantic database: a relationship between two objects.

And the nice thing about using a NoSQL database is that schema is not fixed in concrete, as it is with a SQL database, which would normally require additional layers of manipulation to deal with the second foundational tenet of a semantic database: that your concept of, say, a person’s name might be semantically different (but still compatible) with mine — especially when we consider culture.