The main goal of the Sling Content Distribution module is to allow distribution of content (Sling resources) among different Sling instances. Here, "distribution" means the ability to pick one or more resources on one Sling instance and to copy and persist them onto another Sling instance. The module can distribute content by pushing it to another instance, by pulling it from another instance, or by synchronizing instances through a coordinating instance (the "push", "pull" and "sync" scenarios).
The Sling Content Distribution module consists of the following bundles:
- org.apache.sling.distribution.api: this is where the APIs are defined
- org.apache.sling.distribution.core: this is where the basic infrastructure for distributing content is implemented
- org.apache.sling.distribution.kryo-serializer: Kryo based distribution package serializer
- org.apache.sling.distribution.avro-serializer: Apache Avro based distribution package serializer
- org.apache.sling.distribution.sample: this is a set of sample configurations and implementations for demo purposes
- org.apache.sling.distribution.it: this is the integration testing suite

Sling Content Distribution aims to be reliable, simple and extensible.
Reliability means that the system should keep working in the presence of failures in I/O, the network, and so on. For example, when pushing content from instance A to instance B fails because B is unreachable, instance A should be able to keep pushing (or pulling) content to and from other instances seamlessly. As another example, when delivery of a certain piece of content (a package) fails too many times, the distribution module should be able to either drop it or move it into a different "bucket" of failed items.
Simplicity means that this module should accomplish its tasks by providing clear, minimal and easy to use APIs, together with smart but not overly complicated or "hacky" implementations (see "Simple software is hard").
Extensibility means that the Sling Content Distribution module provides a set of APIs for distributing resources in which each component involved in the distribution lifecycle can be extended or completely replaced.
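The reliability goal above (moving a package that keeps failing into a separate "bucket") can be illustrated with a minimal sketch. The class `RetryingDelivery`, its methods and the `maxAttempts` parameter are hypothetical names chosen for illustration; they are not part of the Sling Content Distribution API, and the real module may handle retries differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch, not part of the Sling Content Distribution API.
class RetryingDelivery {
    private final Deque<String> pending = new ArrayDeque<>();
    private final List<String> failed = new ArrayList<>();
    private final int maxAttempts;

    RetryingDelivery(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    void enqueue(String packageId) {
        pending.addLast(packageId);
    }

    // Tries to deliver every pending package; `deliver` returns true on success.
    // A package that fails maxAttempts times is moved to the failed bucket,
    // so it cannot block the packages queued behind it.
    void process(Predicate<String> deliver) {
        while (!pending.isEmpty()) {
            String packageId = pending.pollFirst();
            boolean delivered = false;
            for (int attempt = 1; attempt <= maxAttempts && !delivered; attempt++) {
                delivered = deliver.test(packageId);
            }
            if (!delivered) {
                failed.add(packageId);
            }
        }
    }

    List<String> failedItems() {
        return failed;
    }
}
```

In this sketch a failing package does not stop delivery of the others, which is the behaviour the reliability requirement asks for.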
A distribution request represents the need to aggregate some resources and copy them from or to another Sling instance. Such requests are handled by agents, which are the main entry point for working with the distribution module. Each agent distributes content from one or more sources to one or more targets; such distribution can be triggered by:
An agent handles a distribution request by creating one or more packages of resources from the source(s), dispatching those packages to one or more queues, and processing the queued packages by persisting them into the target instance(s).
The process of creating one or more packages is called exporting. It may happen either locally to the agent (the "push" scenario) or remotely (the "pull" scenario).
The process of persisting one or more packages is called importing. It may happen either locally (the "pull" scenario) or remotely (the "push" scenario).
In order to properly handle a large number of requests against the same agent, each agent is provided with queues to which the exported packages are sent. The agent then takes care of processing each queue in order to import each package.
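The export, queue and import steps described above can be sketched as a small model. Everything here (`AgentSketch`, `Exporter`, `Importer`, and modelling a package as a list of paths) is an assumption made for illustration and does not mirror the module's real interfaces.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

// Hypothetical model of the agent lifecycle; these are not the module's real types.
class AgentSketch {
    // A package is modelled simply as the list of resource paths it carries.
    interface Exporter {
        List<List<String>> export(List<String> requestedPaths);
    }

    interface Importer {
        void importPackage(List<String> pkg);
    }

    private final Exporter exporter;
    private final Importer importer;
    private final Queue<List<String>> queue = new ArrayDeque<>();

    AgentSketch(Exporter exporter, Importer importer) {
        this.exporter = exporter;
        this.importer = importer;
    }

    // Export packages for the request and dispatch them to the queue.
    // Returns the number of packages that were queued.
    int handle(List<String> requestedPaths) {
        List<List<String>> packages = exporter.export(requestedPaths);
        queue.addAll(packages);
        return packages.size();
    }

    // Drain the queue, importing each package in the order it was queued.
    // Returns the number of packages that were imported.
    int processQueue() {
        int imported = 0;
        List<String> pkg;
        while ((pkg = queue.poll()) != null) {
            importer.importPackage(pkg);
            imported++;
        }
        return imported;
    }
}
```

In a "push" setup the exporter would run locally and the importer would talk to the remote instance; in a "pull" setup the roles are swapped. The sketch leaves both behind interfaces for that reason.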
Distribution agents' configurations are proper OSGi configurations (backed by nodes of type sling:OsgiConfig in the repository).
There are specialized factories for each supported scenario:
For example a "forward" agent can be defined specifying
The sample package contains endpoints for exposing configuration for distribution agents. The DistributionConfigurationResourceProviderFactory is used to expose agent configurations as resources.
{
"jcr:primaryType": "sling:OsgiConfig",
"provider.roots": [ "/libs/sling/distribution/settings/agents" ],
"kind" : "agent"
}
Distribution agents' configurations can be retrieved via HTTP GET:
$ curl -u admin:admin http://localhost:8080/libs/sling/distribution/settings/agents/{agentName}.json
Each distribution agent is an OSGi service and is resolved using a Sling Resource Provider, which locates it under /libs/sling/distribution/services/agents.
The DistributionConfigurationResourceProviderFactory allows one to configure HTTP endpoints to access distribution OSGi configurations. The sample package contains endpoints for exposing distribution agents. The DistributionServiceResourceProviderFactory is used to expose agent services as resources.
{
"jcr:primaryType": "sling:OsgiConfig",
"provider.roots": [ "/libs/sling/distribution/services/agents" ],
"kind" : "agent"
}
Distribution agents can be triggered by sending HTTP POST requests to http://$host:$port/libs/sling/distribution/services/agents/{agentName} with HTTP parameters action and path.
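For callers that prefer Java over curl, the same POST request can be built with the JDK's `java.net.http` client (Java 11+). This is a minimal sketch: the class `AgentTrigger` and its method names are hypothetical, and the use of HTTP Basic authentication is an assumption based on the `-u admin:admin` option in the curl examples in this document.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; only the endpoint path and parameter names come from the text.
class AgentTrigger {
    // Encodes the two form parameters the agent endpoint expects.
    static String formBody(String action, String path) {
        return "action=" + URLEncoder.encode(action, StandardCharsets.UTF_8)
                + "&path=" + URLEncoder.encode(path, StandardCharsets.UTF_8);
    }

    // Builds (but does not send) the POST request for the given agent.
    static HttpRequest buildRequest(String baseUrl, String agentName,
                                    String user, String password,
                                    String action, String path) {
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(
                        URI.create(baseUrl + "/libs/sling/distribution/services/agents/" + agentName))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .header("Authorization", "Basic " + credentials) // assumption: Basic auth
                .POST(HttpRequest.BodyPublishers.ofString(formBody(action, path)))
                .build();
    }

    // Sends the request and returns the HTTP status code.
    static int send(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }
}
```

A call such as `AgentTrigger.send(AgentTrigger.buildRequest("http://localhost:8080", "publish", "admin", "admin", "ADD", "/content/sample1"))` would then correspond to a curl invocation with `-d 'action=ADD' -d 'path=/content/sample1'`.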
The in-memory queue is a draft implementation that uses an in-memory blocking queue together with a Sling scheduled processor, which periodically fetches the first item of each queue and triggers a distribution of that item. It is not suitable for production because it is not persisted: restarting the bundle or the platform loses the queue and its items.
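The idea of a blocking queue drained by a periodic processor can be sketched with plain JDK classes. This is not the module's implementation: `InMemoryQueues` and its methods are hypothetical, a `ScheduledExecutorService` stands in for the Sling scheduler, and the sketch removes the head item before processing it, which may differ from how the real queue handles failures.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch of named in-memory queues with a periodic processor.
class InMemoryQueues {
    // Items live only in memory, so they are lost when the JVM stops.
    private final Map<String, BlockingQueue<String>> queues = new ConcurrentHashMap<>();

    void add(String queueName, String packageId) {
        queues.computeIfAbsent(queueName, name -> new LinkedBlockingQueue<>()).add(packageId);
    }

    // One processing pass: take the first item of each queue and hand it to the processor.
    void processHeads(Consumer<String> processor) {
        for (BlockingQueue<String> queue : queues.values()) {
            String head = queue.poll(); // non-blocking; null when the queue is empty
            if (head != null) {
                processor.accept(head);
            }
        }
    }

    // Runs processHeads periodically; the caller should shut the scheduler down when done.
    ScheduledExecutorService schedule(Consumer<String> processor, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                () -> processHeads(processor), 0, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

Because the map and its queues are ordinary heap objects, nothing survives a restart, which is the limitation the text describes.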
Another queue implementation is based on the queues and jobs provided by the Sling Event bundle. Each item added to a queue triggers the creation of a Sling job, which handles the processing of that item in the queue. By default Sling queues for distribution have the following options:
Each distribution agent uses a specific queue distribution mechanism, specified via a 'queue distribution strategy', which defines how packages are routed into agent queues. The currently available distribution strategies are
In order to configure the "forward" distribution workflow, which transfers content from an author instance to a publish instance:
Send an HTTP POST request to http://localhost:8080/libs/sling/distribution/services/agents/publish with parameters action=ADD and path=/content
$ curl -v -u admin:admin http://localhost:8080/libs/sling/distribution/services/agents/publish -d 'action=ADD' -d 'path=/content/sample1'
$ curl -v -u admin:admin http://localhost:8080/libs/sling/distribution/services/agents/publish -d 'action=DELETE' -d 'path=/content/sample1'
In order to configure the "reverse" distribution workflow, which transfers content from a publish instance to an author instance:
- configure a queue agent on publish to hold the packages that need to be distributed to author
- configure a remote exporter on publish that exports packages from the queue agent
- configure a "reverse" agent on author pointing to the URL of the exporter on publish

Send an HTTP POST request to http://localhost:8080/libs/sling/distribution/services/agents/publish-reverse with parameter action=PULL
$ curl -u admin:admin http://localhost:8081/libs/sling/distribution/services/agents/reverse -d 'action=ADD' -d 'path=/content/sample1'
$ curl -u admin:admin http://localhost:8080/libs/sling/distribution/services/agents/publish-reverse -d 'action=PULL'
In order to configure the "sync" distribution workflow, which transfers content between two publish instances via an author instance:
- configure a remote exporter on each publish instance
- configure a remote importer on each publish instance
- configure a "sync" agent on author pointing to the URLs of the exporters and importers on publish

Send an HTTP POST request to http://localhost:8080/libs/sling/distribution/services/agents/pubsync with parameter action=PULL
$ curl -u admin:admin http://localhost:8081/libs/sling/distribution/services/agents/reverse-pubsync -d 'action=ADD' -d 'path=/content/sample1'
$ curl -u admin:admin http://localhost:8080/libs/sling/distribution/services/agents/pubsync -d 'action=PULL'
We need to expose APIs for configuring, commanding and monitoring distribution agents.
There is a single entry point for triggering a distribution workflow: the Distributor API.
Distributor.distribute(agentName, resourceResolver, distributionRequest)
The following extensions for Apache Sling Content Distribution exist.
The org.apache.sling.distribution.avro-serializer contains a DistributionContentSerializer based on Apache Avro.
The org.apache.sling.distribution.kryo-serializer contains a DistributionContentSerializer based on Kryo.