PredictionIO CLI Cheatsheet

PredictionIO can be seen as two types of servers: the EventServer, which takes in and stores events, and the PredictionServer, which serves predictions. The general, non-template-specific commands can be run from any directory, but the template-specific commands must be run in the directory of the specific engine instance being used, because some of them rely on files such as engine.json being available.

The typical process from install to your first query is:

  1. Install PredictionIO-0.12.1 using instructions here
  2. Start pio with one of the methods listed below, perhaps just pio-start-all if you are using a single machine (do not use these on the AWS AMI!) and check it with pio status
  3. Create an app in the EventServer to store data to
  4. Import data into the EventServer
  5. Download a template
  6. Build the template
  7. Train the template, which reads data and creates a model
  8. Deploy the template
  9. You are now ready to query the deployed template
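The command-line part of the steps above (steps 2 to 4 and 6 to 8) can be sketched as a small shell script. This is a minimal dry-run sketch, not an official script: APP_NAME, APP_ID, EVENTS_FILE, ENGINE_DIR and the DRY_RUN switch are hypothetical placeholders, and it assumes the template has already been downloaded into ENGINE_DIR. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical placeholders; replace them with your own values.
APP_NAME="${APP_NAME:-my-app}"
APP_ID="${APP_ID:-1}"                  # the id reported when the app is created
EVENTS_FILE="${EVENTS_FILE:-events.json}"
ENGINE_DIR="${ENGINE_DIR:-./my-engine}"

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run each step in order and stop at the first one that fails.
workflow() {
  run pio-start-all || return 1        # step 2: single machine only, not the AWS AMI
  run pio status || return 1
  run pio app new "$APP_NAME" || return 1                               # step 3
  run pio import --appid "$APP_ID" --input "$EVENTS_FILE" || return 1   # step 4
  run cd "$ENGINE_DIR" || return 1     # template-specific commands need engine.json
  run pio build || return 1            # step 6
  run pio train || return 1            # step 7
  run pio deploy                       # step 8
}

workflow
```

Setting DRY_RUN=0 executes the commands instead of printing them. Keeping the default makes it easy to review the sequence before touching a running system.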

General Commands

At any point you can run pio help some-command to get a help screen printed with all supported options for a command.

Start/stop Services

PredictionIO assumes that HDFS and Spark are running, so from a clean start launch them first. Warning: do not start services on the AWS AMI; they are already started at boot.

HDFS and Spark may be left running since nothing in this cheatsheet will stop them and they are started at boot on the AWS AMI.

Status and Information

Import Data

The EventServer can hold data as soon as you have created an app as above. Then you can choose to import JSON events from files (the fastest method) or use the REST API to import.

If you want to use the REST method you will use an SDK or make raw REST POST calls. To add events from a shell script, use curl to post to the EventServer on port 7070 of your host, like this:

$ curl -i -X POST "http://localhost:7070/events.json?accessKey=some-key" \
-H "Content-Type: application/json" \
-d '{
  "event": "my_event",
  "entityType": "user",
  "entityId": "user-id",
  "targetEntityType": "item",
  "targetEntityId": "item-id",
  "eventTime": "2004-12-13T21:39:45.618-07:00"
}'

Events are defined by the template so check the specific template docs for encoding data in events.
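For the file import route, pio import reads a file containing one JSON event per line. The helper below is a minimal sketch of producing such lines from a shell script. The function name make_event and its argument order are hypothetical, and the field set simply mirrors the curl example rather than any particular template's requirements.

```shell
# Print one event as a single-line JSON object.
# Arguments: event, entityType, entityId, targetEntityType, targetEntityId, eventTime.
# No JSON escaping is done, so values must not contain quotes or backslashes.
make_event() {
  printf '{"event":"%s","entityType":"%s","entityId":"%s","targetEntityType":"%s","targetEntityId":"%s","eventTime":"%s"}\n' \
    "$1" "$2" "$3" "$4" "$5" "$6"
}

# Example use (not executed here):
#   make_event my_event user user-id item item-id 2004-12-13T21:39:45.618-07:00 >> events.json
#   pio import --appid <app-id> --input events.json
```

Appending with >> keeps one event per line, which is the layout the file import expects.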

Workflow Commands

For some pio commands you must cd to an engine-instance directory, because engine.json and/or manifest.json are either read or modified there. These commands implement the workflow for creating a "model" from events and launching the PredictionServer to serve queries.

Standard Workflow

These commands must be run in this order, but each can be repeated once the commands before it have been run. Many trains are expected after a single build, and the same model can be deployed many times.

Assuming there is data in the EventServer and engine.json is configured correctly:
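A minimal sketch of that sequence, guarded by a check that the current directory really is an engine directory. The function names in_engine_dir and pio_workflow and the PIO override variable are hypothetical, introduced only for illustration. The pio build, pio train and pio deploy commands are the workflow commands this section describes.

```shell
# Succeed only when the given directory (default: current) contains engine.json.
in_engine_dir() {
  [ -f "${1:-.}/engine.json" ]
}

# Run build, train and deploy in order, stopping at the first failure.
# PIO is a hypothetical override so the function can be tried without pio installed.
pio_workflow() {
  if ! in_engine_dir; then
    echo "engine.json not found; cd to the engine directory first" >&2
    return 1
  fi
  "${PIO:-pio}" build && "${PIO:-pio}" train && "${PIO:-pio}" deploy
}
```

Running PIO=echo pio_workflow inside an engine directory prints the three subcommands without invoking PredictionIO, which is a cheap way to confirm the guard before a long training run.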

Universal Recommender Example

Get PIO and required services running

To restart to a clean state

Prepare to import Events

Build The Universal Recommender