Sling Pipes

A tool for doing extract-transform-load operations through a resource tree configuration.

One-shot data transformations often need custom code to be written and executed. This small tool set aims to make such transformations possible with proven, reusable blocks called pipes, which stream resources from one to the next.

What is a pipe

         getOutputBinding
               ^
               |
 getInput  +---+---+   getOutput
           |       |
      +----> Pipe  +---->
           |       |
           +-------+

A sling pipe is essentially a sling resource stream:

At the moment, there are three types of pipes to consider: readers, containers, and writers.

A Plumber OSGi service is provided to help get and execute pipes.
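
To make the diagram above more concrete, here is a minimal JavaScript model of a pipe. It is only a sketch: the real pipes are Java classes, and apart from the three method names taken from the diagram (getInput, getOutput, getOutputBinding), everything here is a hypothetical stand-in, including the in-memory repository and the makeChildrenPipe helper.

```javascript
// Hypothetical in-memory "repository": path -> properties. Not a Sling API.
const repository = {
  "/content/a": { title: "A" },
  "/content/a/one": { title: "One" },
  "/content/a/two": { title: "Two" },
};

// Sketch of a reader pipe that outputs the direct children of its input.
function makeChildrenPipe(path) {
  let current = null; // last resource handed out by getOutput()
  return {
    // The input is the resource found at the configured path.
    getInput() {
      return { path, properties: repository[path] };
    },
    // The output is a lazy stream (here a generator) of resources.
    *getOutput() {
      const prefix = this.getInput().path + "/";
      for (const [childPath, properties] of Object.entries(repository)) {
        const isDirectChild =
          childPath.startsWith(prefix) &&
          !childPath.slice(prefix.length).includes("/");
        if (isDirectChild) {
          current = { path: childPath, properties };
          yield current;
        }
      }
    },
    // The binding exposes the current output to expressions of later pipes.
    getOutputBinding() {
      return current ? current.properties : null;
    },
  };
}

const pipe = makeChildrenPipe("/content/a");
const outputPaths = [];
const boundTitles = [];
for (const resource of pipe.getOutput()) {
  outputPaths.push(resource.path);
  boundTitles.push(pipe.getOutputBinding().title);
}
console.log(outputPaths, boundTitles);
```

The generator keeps the stream lazy: a resource is only produced when the consumer asks for the next one, and the binding always reflects the resource currently being output.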

Registered Pipes

A pipe configuration is a JCR node, with:

readers

Base pipe

A rather dummy pipe that outputs what it gets as input (that is, what is configured in path). It is mostly handy for testing, and it provides basic functionality to the other pipes that inherit from it.

SlingQuery Pipe

executes $(getInput()).children(expression)

JsonPipe

Feeds bindings with remote JSON.

MultiPropertyPipe

Iterates through the values of the input multi-value property and writes them to bindings.

XPathPipe

Retrieves the resources resulting from an XPath query.

AuthorizablePipe

Retrieves the authorizable resource corresponding to the id passed in expression or, if none is found (or the expression is empty), the one corresponding to the input path, and outputs the found authorizable's resource.

ParentPipe

Outputs the parent resource of the input resource. Its sling:resourceType is slingPipes/parent.

FilterPipe

Outputs the input resource if it matches its configuration.

containers

Container Pipe

Assembles a sequence of pipes.
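
Chaining can be pictured as nested iteration: each resource output by one pipe becomes the input of the next one, and the bindings collected so far travel along. The JavaScript sketch below illustrates that idea under stated assumptions; the runContainer function and the shape of the toy sub-pipes are hypothetical and are not the Sling Pipes implementation.

```javascript
// Each hypothetical sub-pipe has a name and a run function that maps
// (input, bindings) to an array of { resource, binding } outputs.
function* runContainer(pipes, input, bindings = {}) {
  if (pipes.length === 0) {
    yield { resource: input, bindings }; // end of the chain
    return;
  }
  const [first, ...rest] = pipes;
  for (const out of first.run(input, bindings)) {
    // Expose the output of this pipe under its name for later pipes.
    const nextBindings = { ...bindings, [first.name]: out.binding };
    yield* runContainer(rest, out.resource, nextBindings);
  }
}

// Two toy pipes: one lists users, one keeps only the active ones.
const users = [
  { path: "/home/users/a", active: true },
  { path: "/home/users/b", active: false },
  { path: "/home/users/c", active: true },
];
const listUsers = {
  name: "user",
  run: () => users.map((u) => ({ resource: u.path, binding: u })),
};
const onlyActive = {
  name: "filter",
  run: (input, bindings) =>
    bindings.user.active ? [{ resource: input, binding: {} }] : [],
};

const results = [...runContainer([listUsers, onlyActive], "/home/users")];
console.log(results.map((r) => r.resource));
```

A sub-pipe that returns no output for a given input simply ends that branch, which is how a filtering step drops resources from the stream.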

ReferencePipe

Executes the pipe referenced in the path property.

writers

Write Pipe

Writes the given nodes and properties to the current input.

MovePipe

Performs a JCR move of the current input to the target path (the input can be a node or a property).

RemovePipe

Removes the input resource and returns its parent, regardless of whether the resource is a node or a property.

PathPipe

Gets or creates the path given in expression.

Making configuration dynamic with pipe bindings

To make things interesting, most configuration values are JavaScript template strings, and hence valid JS expressions that can reuse bindings (from the configuration or from other pipes).

The following configurations are evaluated:

You can use the names of previous pipes in the pipe container, or the special binding path, where path.previousPipe is the path of the current resource of the previous pipe named previousPipe.

Global bindings can be set at pipe execution, and external scripts can be added to the execution as well (see pipe configurations).
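
The evaluation of such expressions can be illustrated with plain JavaScript. The sketch below treats a configuration string as a template literal and evaluates it against a set of bindings; the evaluate function and the sample binding values are hypothetical, and the actual tool evaluates expressions on the server side, possibly with different rules.

```javascript
// Evaluate a configuration string as a JavaScript template literal.
// `bindings` maps binding names to values; all names here are made up.
function evaluate(expression, bindings) {
  const names = Object.keys(bindings);
  const values = names.map((name) => bindings[name]);
  // Assumes the expression contains no backtick characters.
  const render = new Function(...names, "return `" + expression + "`;");
  return render(...values);
}

const bindings = {
  // Output binding of a previous pipe named "profile".
  profile: { gender: "female", fullName: "Jane Doe" },
  // Special "path" binding: path.<pipeName> is that pipe's current path.
  path: { profile: "/home/users/j/jane/profile" },
};

const fullName = evaluate(
  "${(profile.gender === 'female' ? 'Ms ' + profile.fullName : 'Mr ' + profile.fullName)}",
  bindings
);
const target = evaluate("${path.profile}/details", bindings);
console.log(fullName); // Ms Jane Doe
console.log(target); // /home/users/j/jane/profile/details
```

A configuration value without any ${...} placeholder evaluates to itself, so static and dynamic values can be mixed freely in the same configuration.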

How to execute a pipe

For now, pipes can be executed through GET (read) or POST (read/write) commands:

Request Path

The request returns the paths of the resources that have gone through the output of the configured pipe.

Request Parameter binding

You can pass, as the bindings parameter, a JSON object of global bindings to add for the execution of the pipe.

e.g.

curl -u admin:admin -F "path=/etc/pipes/test" -F "bindings={testBinding:'foo'}" http://localhost:4502/etc/pipes.json

will return something like

{"size":2, "items":["/one/output/resource", "another/one"]}

Request Parameter writer

You can pass, as the writer parameter, a JSON object that acts as a pattern for the result you want. The values of the JSON object are expressions and can reuse the bindings of each subpipe. Note that this works only if the pipe called is a container pipe.

e.g.

curl -g -u admin:admin 'http://localhost:4502/etc/pipes/users.json?writer={"user":"${user.fullName}"}'

will return something similar to

{"size":2, "items":[{'user':'John Smith','path':'/home/users/q/q123jk1UAZS'},{'user':'John Doe','path':'/home/users/q/q153jk1UAZS'}]}

Request Parameter dryRun

If the dryRun parameter is set to true and the executed pipe is supposed to modify content, it will log (as best it can) the changes it would have made, without modifying anything.

Request Parameter size

The default response is truncated to 10 items. If you need more (or fewer), you can change that setting with the size parameter.
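
The request parameters described above (path, bindings, writer, dryRun, size) can be combined in a single call. The JavaScript sketch below only assembles such a request and does not send it. The buildPipeRequest function and its option names are hypothetical, and the sketch assumes the server accepts URL-encoded form parameters in the same way as the form fields sent by curl -F.

```javascript
// Assemble the parameters of a pipe execution request without sending it.
// The parameter names (path, bindings, writer, dryRun, size) come from the
// text above; the function itself and its option names are hypothetical.
function buildPipeRequest(servletUrl, options) {
  const params = new URLSearchParams();
  params.set("path", options.path);
  if (options.bindings) params.set("bindings", JSON.stringify(options.bindings));
  if (options.writer) params.set("writer", JSON.stringify(options.writer));
  if (options.dryRun) params.set("dryRun", "true");
  if (options.size !== undefined) params.set("size", String(options.size));
  // POST is used because the pipe may write; a read-only pipe could use GET.
  return { url: servletUrl, method: "POST", body: params };
}

const request = buildPipeRequest("http://localhost:4502/etc/pipes.json", {
  path: "/etc/pipes/test",
  bindings: { testBinding: "foo" },
  dryRun: true,
  size: 20,
});
console.log(request.method, request.url, request.body.toString());
```

The resulting object can be handed to the HTTP client of your choice, with the same credentials as in the curl examples. Running a write pipe with dryRun first and a larger size is a cheap way to review what it would change before executing it for real.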

sample configurations

slingQuery | write

This pipe parses all profile nodes and prefixes each full name with "Mr" or "Ms" depending on gender.

{
  "sling:resourceType":"slingPipes/container",
  "name":"Dummy User prefix Sample",
  "jcr:description":"prefix all full names of profile with \"Mr\" or \"Ms\" depending on gender",
  "conf":{
    "profile": {
        "sling:resourceType":"slingPipes/slingQuery",
        "expr":"nt:unstructured#profile",
        "path":"/home/users"
    },
    "writeFullName": {
        "sling:resourceType":"slingPipes/write",
        "conf": {
            "fullName":"${(profile.gender === 'female' ? 'Ms ' + profile.fullName : 'Mr ' + profile.fullName)}",
            "generatedBy":"slingPipes"
        }
    }
  }
}

slingQuery | multiProperty | authorizable | write

{
  "jcr:primaryType": "sling:Folder",
  "jcr:description": "move badge<->user relationship from badge MV property to a user MV property",
  "name": "badges",
  "sling:resourceType": "slingPipes/container",
  "conf": {
    "jcr:primaryType": "sling:OrderedFolder",
    "badge": {
      "jcr:primaryType": "sling:Folder",
      "jcr:description": "outputs all badge component resources",
      "expr": "[sling:resourceType=myApp/components/badge]",
      "path": "/etc/badges/badges-admin/jcr:content",
      "sling:resourceType": "slingPipes/slingQuery"
      },
    "profile": {
      "jcr:primaryType": "sling:Folder",
      "jcr:description": "retrieve all user ids from a mv property",
      "path": "${path.badge}/profiles",
      "sling:resourceType": "slingPipes/multiProperty"
    },
    "user": {
      "jcr:primaryType": "sling:OrderedFolder",
      "jcr:description": "outputs user resource",
      "expr": "profile",
      "sling:resourceType": "slingPipes/authorizable"
    },
    "write": {
      "jcr:primaryType": "sling:OrderedFolder",
      "jcr:description": "patches the badge path to the badges property of the user profile",
      "path": "${path.user}/profile",
      "sling:resourceType": "slingPipes/write",
      "conf": {
        "jcr:primaryType": "nt:unstructured",
        "badges": "+[${path.badge}]"
      }
    }
  }
}

xpath | json | write

This use case is for completing repository profiles with data from an external system (that has a JSON API).

{
  "jcr:primaryType": "nt:unstructured",
  "jcr:description": "this pipe retrieves json info from an external system and writes them to the user profile, uses moment.js, it distributes modified resources using publish distribution agent",
  "sling:resourceType": "slingPipes/container",
  "distribution.agent": "publish",
  "additionalScripts": "/etc/source/moment.js",
  "conf": {
    "jcr:primaryType": "sling:OrderedFolder",
    "profile": {
      "jcr:primaryType": "sling:OrderedFolder",
      "expr": "/jcr:root/home/users//element(profile,nt:unstructured)[@uid]",
      "jcr:description": "query all user profile nodes",
      "sling:resourceType": "slingPipes/xpath"
    },
    "json": {
      "jcr:primaryType": "sling:OrderedFolder",
      "expr": "${(profile.uid ? 'https://my.external.system.corp.com/profiles/' + profile.uid.substr(0,2) + '/' + profile.uid + '.json' : '')}",
      "jcr:description": "retrieves json information relative to the given profile, if the uid is not found, expr is empty: the pipe will do nothing",
      "sling:resourceType": "slingPipes/json"
    },
    "write": {
      "jcr:primaryType": "sling:OrderedFolder",
      "path": "${path.profile}",
      "jcr:description": "write json information to the profile node",
      "sling:resourceType": "slingPipes/write",
      "conf": {
        "jcr:primaryType": "sling:OrderedFolder",
        "background": "${json.opt('background')}",
        "about": "${json.opt('about')}",
        "birthday": "${(json.opt('birthday') ? moment(json.opt('birthday'), \"MMMM DD\").toDate() : '')}",
        "mobile": "${json.opt('mobile')}"
      }
    }
  }
}

xpath | parent | rm

{
  "jcr:primaryType": "nt:unstructured",
  "jcr:description": "this pipe removes users with a bad property in their profile",
  "sling:resourceType": "slingPipes/container",
  "conf": {
    "jcr:primaryType": "sling:OrderedFolder",
    "profile": {
      "jcr:primaryType": "sling:OrderedFolder",
      "expr": "/jcr:root/home/users//element(profile,nt:unstructured)[@bad]",
      "jcr:description": "query all user profile nodes with bad properties",
      "sling:resourceType": "slingPipes/xpath"
    },
    "parent": {
      "jcr:primaryType": "sling:OrderedFolder",
      "jcr:description": "get the parent node (user node)",
      "sling:resourceType": "slingPipes/parent"
    },
    "rm": {
      "jcr:primaryType": "sling:OrderedFolder",
      "jcr:description": "remove it",
      "sling:resourceType": "slingPipes/rm"
    }
  }
}

Some other samples are in https://github.com/npeltier/sling-pipes/tree/master/src/test/

Compatibility

To run this tool on a Sling instance, you need:

Rev. 1731045 by rombert on Thu, 18 Feb 2016 10:54:15 +0000