Queries

The Universal Recommender has a reasonable set of defaults so queries can be very simple or, when the need arises, very flexible. There are two parts of all UR Queries:

  1. The Subject: This is a user-id, item-id, or list of item-ids. Recommendations consist of the best item matches to the subject. For users this means the items that have had interactions from the most similar users. For items this means other items that have the most similar interactions.
  2. Business Rules: These restrict which items can be recommended and define matches to item properties that will allow, boost, or exclude items as recommendations. If an item cannot be recommended, the Business Rules will never cause it to be returned for a Query.

The Simplest Query

{}

This has no user, item, or item-set so it can do nothing but return popular items. For every model there is one "popularity" method chosen, which ranks all items that have interactions, that is, all items that are the target of Events/Indicators. The method is configured in the UR engine's JSON algorithm parameters; if none is given, the UR just counts the number of primary Indicator Events per item for all recorded time. See Configuration for other tuning of the "popularity" ranking.
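As a rough illustration of the default described above, the sketch below ranks items by counting primary Indicator Events per item. The event tuples and the `rank_by_popularity` helper are hypothetical stand-ins, not part of the UR; the real ranking is done inside the engine.

```python
from collections import Counter


def rank_by_popularity(events, primary_indicator):
    """Rank item ids by how many primary-indicator events target them.

    `events` is an iterable of (user_id, indicator_name, item_id) tuples,
    a hypothetical in-memory stand-in for the recorded Events.
    """
    counts = Counter(
        item_id
        for _user, indicator, item_id in events
        if indicator == primary_indicator
    )
    # most_common() orders by count, highest first
    return [item_id for item_id, _count in counts.most_common()]


events = [
    ("u1", "buy", "iPad"),
    ("u2", "buy", "iPad"),
    ("u2", "view", "Galaxy S10"),  # not the primary indicator, so not counted
    ("u3", "buy", "Galaxy S10"),
    ("u3", "buy", "iPad"),
]
popular = rank_by_popularity(events, "buy")
```

Here "iPad" has three "buy" events and "Galaxy S10" has one, so "iPad" ranks first.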

Simple Personalized User-based Query

{
  "user": "John Doe"
}

This gets historical data for "John Doe" and uses it to return the best personalized recommendations. John has left a record of indicators by interacting with a site or app in a way that triggered Events. Depending on what was recorded, we might expect purchases, search terms, and category preferences to be in John's history. Harness records this in real time, so if John searched only a moment ago, those search terms might be used in making recommendations.
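A minimal sketch of how a client might package such a query for the server follows. The base URL, the engine id `my_ur_engine`, and the `/engines/<engine-id>/queries` path are assumptions made for illustration; check your Harness deployment for the actual address and route. The helper only builds the request and does not send it.

```python
import json
import urllib.request


def build_query_request(base_url, engine_id, query):
    """Build (but do not send) a POST request carrying a UR query as JSON."""
    # Assumed endpoint layout; adjust to match your deployment.
    url = f"{base_url}/engines/{engine_id}/queries"
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request(
    "http://localhost:9090",  # hypothetical server address
    "my_ur_engine",           # hypothetical engine id
    {"user": "John Doe"},
)
# To send it: urllib.request.urlopen(request).read()
```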

Item-based Query

{
  "item": "iPad"
}

This query is the kind you see at the bottom of a product page in Amazon. It shows "other people who bought 'iPad' also liked these". It returns items that have seen similar user behavior. This means it is non-personalized (after all there is no user in the query) but finds items which on average are similar to the item in the query. Similarity here is only based on the indicators of other users, not item properties. This type of recommendation is quite useful as the Amazon use case shows, and can be used if you know nothing about the user. It may be useful to think of these as item-details page recommendations.

Simple Item-set Query (Wishlist)

This query applies to a wide variety of lists, like watchlists, favorites, shopping carts, wishlists, etc. It returns Similar Items for the items in the list.

{
  "itemSet": ["item-1", "item-5", "item-300", "item-2", ...]   
}

To get the missing items you will need to train a separate model on item-sets (not individual user behavior). However even if your model is made from user behavior this query has a place for lists that do not naturally have missing pieces.

Complementary Items (Shopping-cart)

The query will be exactly the same as an item-set query, but to get "Complementary Items" based on things that go together, you need to create a model from some grouping of items, like things bought together in a shopping cart or items viewed in a session. Doing this will turn the query from "find items similar to all these" into "find the missing items from this list". This is an important distinction. For example, similar items may all look the same, while complementary items may look quite different. If John has a Galaxy S10 and a Galaxy Case in his shopping cart, would it be better to recommend "USB-C cord" or "Galaxy S9"? Past experience and research show that Complementary Items get better results than Similar Items in many cases.

This topic is more advanced and takes us into methods for generating different model types, so see Use Cases.

Business Rules

Business Rules overrule the ranking of items recommended. They put restrictions on what a query can return or reorder them. These restrictions are based on item properties. Therefore add item properties (using $set Events) to support the Rules you need. For example $set the in-stock property to true for items that can be converted on, if you later want to restrict recommendations to in-stock items.

Since Business Rules restrict items that can be recommended they will theoretically result in fewer conversions overall. However this may not be true in practice. For instance requiring an item be in-stock will likely cause more conversions. However be careful of using business rules just because they seem correct. If intuition is wrong they can reduce performance.

Rule Types

The "name" and "values" of the Rule definition below are fairly clear. The "bias", however, picks which type of Rule is executed: inclusion, exclusion, or boost.

The "name" identifies the property name to match. The "values" provide a list of values that the properties of the recommended items are matched against. If the Rule is an inclusion then at least one of the values must match the property values. If the Rule is an exclusion then no value is allowed to match the recommended items. If the Rule is a boost then, effectively, the score of the recommendation is multiplied by the bias, taking into account how many values match the item.

The way to think about Boost Rules is that they favor recommended items that match more of the values and so raise them in the ranking. If the score of a non-matching recommended item is still higher than the boosted score it will be recommended higher, but it is possible to increase the bias amount to almost guarantee items matching the Rules will come out on top.
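The three Rule types can be sketched in a few lines of client-side code. This is only an illustration of the behavior described above, not how the UR implements it. The bias convention (negative for inclusion, zero for exclusion, positive for boost) and the choice to multiply the score once per matching value are assumptions of this sketch, and `apply_rules` is a hypothetical helper.

```python
def apply_rules(candidates, rules):
    """Filter and re-rank candidates with inclusion, exclusion, and boost rules.

    `candidates` is a list of dicts shaped like
    {"item": id, "score": float, "props": {name: [values]}}.
    Assumed convention: bias < 0 is inclusion, bias == 0 is exclusion,
    bias > 0 is a boost multiplier.
    """
    ranked = []
    for cand in candidates:
        score = cand["score"]
        keep = True
        for rule in rules:
            item_values = set(cand["props"].get(rule["name"], []))
            matches = len(item_values & set(rule["values"]))
            bias = rule["bias"]
            if bias < 0 and matches == 0:
                keep = False  # inclusion: at least one value must match
            elif bias == 0 and matches > 0:
                keep = False  # exclusion: no value may match
            elif bias > 0:
                # Assumed: one multiplication per matching value.
                score *= bias ** matches
        if keep:
            ranked.append({"item": cand["item"], "score": score})
    return sorted(ranked, key=lambda c: c["score"], reverse=True)


candidates = [
    {"item": "a", "score": 1.0, "props": {"categories": ["series"], "genre": ["sci-fi"]}},
    {"item": "b", "score": 1.01, "props": {"categories": ["series"], "genre": ["comedy"]}},
    {"item": "c", "score": 5.0, "props": {"categories": ["movie"], "genre": ["sci-fi"]}},
]
rules = [
    {"name": "categories", "values": ["series", "mini-series"], "bias": -1},
    {"name": "genre", "values": ["sci-fi", "detective"], "bias": 1.02},
]
result = apply_rules(candidates, rules)
```

Item "c" is dropped by the inclusion Rule despite its high score, and the small boost is just enough to lift "a" above "b".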

Personalized with Rules

{
  "user": "John Doe",
  "rules": [
    {
      "name": "categories",
      "values": ["series", "mini-series"],
      "bias": -1 // filter out all except 'series' or 'mini-series'
    },{
      "name": "genre",
      "values": ["sci-fi", "detective"],
      "bias": 1.02 // boost/favor recommendations with the 'genre' = 'sci-fi' or 'detective'
    }
  ]
}

This returns items based on the history of user "John Doe", including only the categories mentioned and boosted slightly to favor the genres specified. (Note: the property values have been attached to items with "$set" events.) The "bias" indicates whether a Rule is a filter or a boost. As always, the blacklist, backfill, and other settings affecting results use the defaults in the Engine's config.

Date Ranges in Queries

Use a date range when a date is stored in the item's properties.

{
  "user": "xyz",
  "rules": [
    {
      "name": "categories",
      "values": ["series", "mini-series"],
      "bias": -1 // filter out all except 'series' or 'mini-series'
    },{
      "name": "genre",
      "values": ["sci-fi", "detective"],
      "bias": 1.02 // boost/favor recommendations with the 'genre' = 'sci-fi' or 'detective'
    }
  ],
  "dateRange": {
    "name": "availabledate",
    "before": "2015-08-20T11:28:45.114-07:00",
    "after": "2015-08-15T11:28:45.114-07:00"
  }
}

Items are assumed to have a field of the same "name" that has a date associated with it using a "$set" event. The query will return only those recommendations where the date field is in "dateRange". Either date bound can be omitted for a one-sided range. The range applies to all returned recommendations, even those for popular items.
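The range check described above can be sketched as follows. This is an illustration only: `in_date_range` is a hypothetical helper, and treating both bounds as exclusive is an assumption, since the text does not say whether the bounds are inclusive.

```python
from datetime import datetime


def in_date_range(item_date, date_range):
    """Return True if an ISO 8601 item date falls inside a query dateRange.

    Either "before" or "after" may be missing, giving a one-sided range.
    Both bounds are treated as exclusive (an assumption of this sketch).
    """
    when = datetime.fromisoformat(item_date)
    before = date_range.get("before")
    after = date_range.get("after")
    if before is not None and not when < datetime.fromisoformat(before):
        return False
    if after is not None and not when > datetime.fromisoformat(after):
        return False
    return True


two_sided = {
    "name": "availabledate",
    "before": "2015-08-20T11:28:45.114-07:00",
    "after": "2015-08-15T11:28:45.114-07:00",
}
one_sided = {"name": "availabledate", "after": "2015-08-15T11:28:45.114-07:00"}

inside = in_date_range("2015-08-17T00:00:00-07:00", two_sided)
too_late = in_date_range("2015-08-21T00:00:00-07:00", two_sided)
open_ended = in_date_range("2015-08-21T00:00:00-07:00", one_sided)
```

Note that "before" must be later than "after" for the range to contain any dates.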

Queries using Available and Expire Dates

When setting an available date and an expire date on items, name the fields to be used in the Engine's config with the expireDateName and availableDateName params. The current date of the Harness server is then used as a filter: the UR checks that the current date is before the expire date and after the available date. If these fields are named in the Engine's config, all items are assumed to have both properties, so both must be specified when setting item properties. If a one-sided range is desired, set the available date to some time in the past and/or the expire date to some time far in the future; this guarantees that the item will not be filtered out by that limit.

Note: a somewhat hidden effect of this is that if these fields are specified in the Engine's config, the filter will apply to every query made. So disabling it for specific items must be done by manipulating their available/expire dates.

{
  "user": "xyz",
  "rules": [
    {
      "name": "categories",
      "values": ["series", "mini-series"],
      "bias": -1 // filter out all except 'series' or 'mini-series'
    },{
      "name": "genre",
      "values": ["sci-fi", "detective"],
      "bias": 1.02
    }
  ]
}

Note: No values are supplied in the query if the date names are in the Engine's config, so this looks the same as a query with no available/expire filter. Check the Engine's config to know for sure whether this kind of filter is being used.
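The available/expire check can be pictured with the sketch below. The property names `available` and `expires`, the placeholder dates, and the `is_recommendable` helper are all hypothetical; in a real deployment the names come from availableDateName and expireDateName in the Engine's config, and the check runs on the server using its own clock.

```python
from datetime import datetime, timezone

# Hypothetical placeholder values for a one-sided range.
FAR_PAST = "1970-01-01T00:00:00+00:00"
FAR_FUTURE = "2999-01-01T00:00:00+00:00"


def is_recommendable(item_props, now, available_name="available", expire_name="expires"):
    """True if `now` is after the available date and before the expire date."""
    available = datetime.fromisoformat(item_props[available_name])
    expire = datetime.fromisoformat(item_props[expire_name])
    return available < now < expire


now = datetime(2020, 1, 1, tzinfo=timezone.utc)  # stands in for the server date

always_on = {"available": FAR_PAST, "expires": FAR_FUTURE}
expired = {"available": FAR_PAST, "expires": "2019-06-01T00:00:00+00:00"}
not_yet = {"available": "2021-01-01T00:00:00+00:00", "expires": FAR_FUTURE}

results = [is_recommendable(p, now) for p in (always_on, expired, not_yet)]
```

Setting one bound to a far-off placeholder is what makes the range effectively one-sided.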

Personalized Queries with Similar Items

{
  "user": "John Doe",
  "userBias": 2, // favor personalized recommendations
  "item": "Mr Robot", // fallback to contextual recommendations
  "rules": [
    {
      "name": "categories",
      "values": ["series", "mini-series"],
      "bias": -1 // filter out all except 'series' or 'mini-series'
    },{
      "name": "genre",
      "values": ["sci-fi", "detective"],
      "bias": 1.02 // boost/favor recommendations with the 'genre' = 'sci-fi' or 'detective'
    }
  ]
}

This returns items based on John Doe's history or similar to the item "Mr Robot", but favoring user history-based recommendations. These are filtered by categories and boosted to favor genre-specific items.

Note: This query should be considered experimental. Mixing user history with item similarity is possible but may have unexpected results. It is also possible to make 2 queries, one user-based and one item-based, and mix the results, and this may be better.
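One way to mix the results of two separate queries on the client is sketched below. The result shape (a list of `{"item": ..., "score": ...}` entries), the weights, and the `blend_results` helper are assumptions for illustration. Scores from different queries may not be directly comparable, so treat the weighting as something to tune and test.

```python
def blend_results(user_results, item_results, user_weight=2.0, item_weight=1.0, num=4):
    """Merge two ranked result lists into one, weighting each list's scores.

    Each input is a list of {"item": id, "score": float} dicts (an assumed
    shape). Items present in both lists have their weighted scores summed.
    """
    combined = {}
    for results, weight in ((user_results, user_weight), (item_results, item_weight)):
        for rec in results:
            combined[rec["item"]] = combined.get(rec["item"], 0.0) + weight * rec["score"]
    ranked = sorted(combined.items(), key=lambda pair: pair[1], reverse=True)
    return [{"item": item, "score": score} for item, score in ranked[:num]]


user_results = [{"item": "x", "score": 1.0}, {"item": "y", "score": 0.5}]
item_results = [{"item": "y", "score": 0.8}, {"item": "z", "score": 0.9}]

blended = blend_results(user_results, item_results)
top_two = blend_results(user_results, item_results, num=2)
```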

Query Parameter Specification

Query fields determine what data is used to match when returning recommendations.

{
  "user": "John Doe",
  "userBias": -maxFloat..maxFloat,
  "item": "iPad",
  "itemBias": -maxFloat..maxFloat,
  "itemSet": ["cd53454543513", "lg1", "vf23423432", "af87634"],
  "itemSetBias": -maxFloat..maxFloat,
  "from": 0,
  "num": 4,
  "rules": [
    {
      "name": "fieldname",
      "values": ["fieldValue1", ...],
      "bias": -maxFloat..maxFloat
    },...
  ],
  "dateRange": {
    "name": "dateFieldname",
    "before": "2015-09-15T11:28:45.114-07:00",
    "after": "2015-08-15T11:28:45.114-07:00"
  },
  "blacklistItems": ["itemId1", "itemId2", ...],
  "returnSelf": true | false
}

Defaults are either noted or taken from algorithm values, which themselves may have defaults. This allows very simple queries for the simple, most used cases.

The query returns popular items, personalized recommendations, similar items, items similar to an item-set, or a mix of all these. The query itself determines this by supplying an item, user, item-set, or any mix.
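Because every parameter is optional, a client can assemble queries by including only what it has. The sketch below shows one way to do that; `build_query` is a hypothetical helper, and reading "from" as a paging offset alongside "num" is an assumption based on the parameter names.

```python
def build_query(user=None, item=None, item_set=None, rules=None, start=0, num=None):
    """Assemble a query dict, leaving out every parameter that was not given."""
    query = {}
    if user is not None:
        query["user"] = user
    if item is not None:
        query["item"] = item
    if item_set:
        query["itemSet"] = list(item_set)
    if rules:
        query["rules"] = list(rules)
    if start:
        query["from"] = start  # assumed to be a paging offset
    if num is not None:
        query["num"] = num
    return query


popular_query = build_query()                         # no subject: popular items
user_query = build_query(user="John Doe", num=4)      # personalized
next_page = build_query(user="John Doe", start=4, num=4)
similar_query = build_query(item="iPad")              # item-based
```

With no arguments the helper produces the empty query, which matches the simplest query shown earlier in this document.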