Action ML Logo

Table of contents

UR Input

The Universal Recommender input is formed of what are called "indicators" because you expect them to indicate something about user's preferences. They provide some bit of information about user behavior, preference, or attribute. Indicators come in two types that have very different uses:

  1. User indicators These are always tied to users and often indicate some user action, though they are not limited to actual events and can be extended to user profile properties and usage context. A purchase, video view, article read, search term, share, location, and many other things can be used as indicators.

    • Primary indicators: This is required by the UR algorithm. It is a record of the type of thing you want to recommend—buy, play, watch, read, etc. This should include the items from which you will recommend.

    • Secondary events: These encode anything else we know about the user that we think may be an indicator of taste—category or genre preference, location, pageviews, location, user profile data, tag preference, likes, follows, shares, even search terms.

  2. Item property changes are always tied to items and are used to set, change, or unset properties of items. User properties are not supported with property change events only through preference indicators/usage events.

The UR may have many indicators as input. These should be seen as a primary indicator, which is a very clear indication of a user preference and several secondary indicators, which we think will tell us something about user "taste" in some way. The Universal Recommender is built on a distributed Correlated Cross-Occurrence (CCO) Engine, which basically means that it will test every secondary indicator to make sure that it actually correlates to the primary one and those that do not correlate will have little or no effect on recommendations.

An example of this in practice was presented on the IBM DevWorks site in Multi-domain predictive AI or how to make one thing predict another

Preference Indicators aka Usage Events

Events in Harness are used to encode indicators and item property changes. They match Apache PredictionIO so that exporting from PIO will produce files that can be imported into Harness + UR.

Events are JSON objects that encode data sent to the REST input API. The UR interprets them as either Indicators or Item Property Change Events.

An ECommerce Primary Indicator might look like this:

{
   "event" : "purchase",
   "entityType" : "user",
   "entityId" : "John Doe",
   "targetEntityType" : "item",
   "targetEntityId" : "iPad",
   "properties" : {},
   "eventTime" : "2015-10-05T21:02:49.228Z"
}

Rules for Indicators are:

This is what a "purchase" event looks like. Note that a usage event always is from a user and has a user id. Also the "targetEntityType" is always "item". The actual target entity is implied by the event's "event" attribute. So to create a "category-preference" event you would send something like this:

{
   "event" : "category-preference",
   "entityType" : "user",
   "entityId" : "John Doe",
   "targetEntityType" : "item",
   "targetEntityId" : "electronics",
   "properties" : {},
   "eventTime" : "2015-10-05T21:02:49.228Z"
}

This event might be sent when the user clicked on the "electronics" category or perhaps purchased an item that was in the "electronics" category. Note again that the "targetEntityType" is always "item".

Set or Change Item Properties

Certain Event Names are reserved for special purposes that apply to any Engine attached to Harness. Theses are used by the UR to create, update, or delete Item Properties. They will always be targeted at item-ids and have the side-effect of creating an item object in the store.

To attach properties to items use a "$set" event like this:

{
   "event" : "$set",
   "entityType" : "item",
   "entityId" : "iPad",
   "properties" : {
      "category": ["electronics", "mobile"],
      "expireDate": "2016-10-05T21:02:49.228Z"
   },
   "eventTime" : "2015-10-05T21:02:49.228Z"
}
{
   "event":"$set",
   "entityType":"item",
   "entityId":"Mr Robot",
   "properties": {
      "content-type":["tv show"],
      "genres":["suspense","sci-fi", "drama"],
      "actor":["Rami Malek", "Christian Slater"],
      "keywords":["hacker"],
      "first_aired":["2015"]
   }
   "eventTime" : "2016-10-05T21:02:49.228Z"
}

Unless a property has a special meaning specified in engine instances JSON conifg, like date values, the property is assumed to be an array of strings, which act like categorical tags.

Delete Items

The UR keeps a collection of item objects in its store. These are created when properties are first $set for an item-id. These will grow without limit for each new item-id unless they are removed using the item $delete event. This does not remove any indicator events, which are only accessible using a TTL (see UR Config)

{
   "event":"$delete",
   "entityType":"item",
   "entityId":"Mr Robot",
   "eventTime" : "2016-10-05T21:02:49.228Z"
}

Brief Intro to "Rules"

You can add things like "premium" to the "tier" property then later if the user is a subscriber you can set a filter that allows recommendations from "tier": ["free", "premium"] where a non subscriber might only get recommendations for "tier": ["free"]. These are passed in to the query using the "rules" parameter (see Queries).

Using properties is how boost and filter rules are applied to recommended items. It may seem odd to treat a category as a filter and as a secondary event (category-preference) but the two pieces of data are used in quite different ways. As properties they bias the recommendations, when they are events they add to user data that returns recommendations. In other words as properties they work with boost and filter business rules as secondary usage events they show something about user taste to make recommendations better.