1. Overview

ProActive Scheduler is a free and open-source job scheduler. The user specifies the computation in terms of a series of computation steps along with their execution and data dependencies. The Scheduler executes this computation on a cluster of computation resources, each step on the best-fit resource and in parallel wherever its possible.

architecture

The administration guide covers cluster setup and cluster administration. Cluster setup includes two main steps:

1.1. Glossary

The following terms are used throughout the documentation:

ProActive Workflows & Scheduling

The full distribution of ProActive for Workflows & Scheduling, it contains the ProActive Scheduler server, the REST & Web interfaces, the command line tools. It is the commercial product name.

ProActive Scheduler

Can refer to any of the following:

  • A complete set of ProActive components.

  • An archive that contains a released version of ProActive components, for example ProActiveWorkflowsScheduling-OS-ARCH-VERSION.zip.

  • A set of server-side ProActive components installed and running on a Server Host.

Resource Manager

ProActive component that manages ProActive Nodes running on Compute Hosts.

Scheduler

ProActive component that accepts Jobs from users, orders the constituent Tasks according to priority and resource availability, and eventually executes them on the resources (ProActive Nodes) provided by the Resource Manager.

Please note the difference between Scheduler and ProActive Scheduler.
REST API

ProActive component that provides RESTful API for the Resource Manager, the Scheduler and the Workflow Catalog.

Resource Manager Web Interface

ProActive component that provides a web interface to the Resource Manager.

Scheduler Web Interface

ProActive component that provides a web interface to the Scheduler.

Workflow Studio

ProActive component that provides a web interface for designing Workflows.

Workflow Catalog

ProActive component that provides storage and versioning of Workflows through a REST API. It is also possible to query the catalog for specific Workflows.

Bucket

ProActive notion used with the Workflow Catalog to refer to a specific collection of workflows.

Server Host

The machine on which ProActive Scheduler is installed.

SCHEDULER_ADDRESS

The IP address of the Server Host.

ProActive Node

One ProActive Node can execute one Task at a time. This concept is often tied to the number of cores available on a Compute Host. We assume a task consumes one core (more is possible, see multi-nodes tasks, so on a 4 cores machines you might want to run 4 ProActive Nodes. One (by default) or more ProActive Nodes can be executed in a Java process on the Compute Hosts and will communicate with the ProActive Scheduler to execute tasks.

Compute Host

Any machine which is meant to provide computational resources to be managed by the ProActive Scheduler. One or more ProActive Nodes need to be running on the machine for it to be managed by the ProActive Scheduler.

Examples of Compute Hosts:

PROACTIVE_HOME

The path to the extracted archive of ProActive Scheduler release, either on the Server Host or on a Compute Host.

Workflow

User-defined representation of a distributed computation. Consists of the definitions of one or more Tasks and their dependencies.

Workflow Revision

ProActive concept that reflects the changes made on a Workflow during it development. Generally speaking, the term Workflow is used to refer to the latest version of a Workflow Revision.

Job

An instance of a Workflow submitted to the ProActive Scheduler. Sometimes also used as a synonym for Workflow.

Task

A unit of computation handled by ProActive Scheduler. Both Workflows and Jobs are made of Tasks.

Node Source

A set of ProActive Nodes deployed using the same deployment mechanism and sharing the same access policy.

ProActive Agent

A daemon installed on a Compute Host that starts and stops ProActive Nodes according to a schedule, restarts ProActive Nodes in case of failure and enforces resource limits for the Tasks.

2. Get Started

Download ProActive Scheduler and unzip the archive.

The extracted folder will be referenced as PROACTIVE_HOME in the rest of the documentation. The archive contains all required dependencies.

ProActive Scheduler is ready to be started with no extra configuration.

$ PROACTIVE_HOME/bin/proactive-server
The router created on localhost:33647
Starting the scheduler...
Starting the resource manager...
The resource manager with 4 local nodes created on pnp://localhost:41303/
The scheduler created on pnp://localhost:41303/
Starting the web applications...
The web application /scheduler created on http://localhost:8080/scheduler
The web application /rm created on http://localhost:8080/rm
The web application /rest created on http://localhost:8080/rest
The web application /studio created on http://localhost:8080/studio
The web application /workflow-catalog created on http://localhost:8080/workflow-catalog
*** Get started at http://localhost:8080 ***

The following ProActive Scheduler components are started:

The URLs of the Scheduler, Resource Manager, REST API and Web Interfaces are displayed in the output.

Default credentials: admin/admin

Your ProActive Scheduler is ready to execute Jobs!

3. ProActive Scheduler configuration

All configuration files of ProActive Scheduler can be found under PROACTIVE_HOME.

Table 1. ProActive Scheduler configuration files
Component Description File Reference

Scheduler

Scheduling Properties

config/scheduler/settings.ini

Scheduler Properties

Resource Manager

Node management configuration

config/rm/settings.ini

Resources Manager Properties

Web Applications

REST API and Web Applications configuration

config/web/settings.ini

REST API & Web Properties

Networking

Network, firewall, protocols configuration

config/network/node.ini, config/network/server.ini

Network Properties

Security

User logins and passwords

config/authentication/login.cfg

File

User group assignments

config/authentication/group.cfg

File

User permissions

config/security.java.policy-server

User Permissions

LDAP configuration

config/authentication/ldap.cfg

LDAP

4. Installation on a Cluster

Adding Compute Hosts of a cluster to the ProActive Scheduler typically involves unpacking the release archive on all those hosts. Once it’s done you need to run a ProActive Node on the Compute Host and connect it to the ProActive Scheduler. There are two principal ways of doing that:

  • Launch a process on the Compute Host and connect it to the ProActive Scheduler

  • Initiate the deployment from the ProActive Scheduler: Node Source creation

If you are not familiar with ProActive Scheduler you may want to try the first method as it’s easier to understand. Combined with ProActive Agents, it gives you the same result as the second method.

The second method implies that you have a remote access to Compute Hosts (e.g. SSH access) and you want to start and stop ProActive Nodes by launching commands remotely on Compute Hosts. For instance, it can be useful when a virtual machine needs to be deployed prior to launching a ProActive Node.

4.1. Deploy ProActive Nodes manually

4.1.1. Using PROACTIVE_HOME

Let’s take a closer look at the first method described above. To deploy a ProActive Node from the Compute Host you need to run the following command

$ PROACTIVE_HOME/bin/proactive-node -r pnp://SCHEDULER_ADDRESS:41303

where -r option is used to specify the URL of the Resource Manager. You can find this URL in the output of the ProActive Scheduler (the Resource Manager URL). If you want to run multiple tasks at the same time on the same machine, you can either start a few proactive-node processes or start multiple nodes from the same process using the -w command line option.

You can also use discovery to let the ProActive Node find the URL to connect to on its own. Simply run proactive-node without any parameter to use discovery. It uses broadcast to retrieve the URL so this feature might not work depending on your network configuration.

4.1.2. Using node.jar

It is also possible to launch a ProActive Node without even copying the ProActive Scheduler to a Compute Host:

  • Open a browser on the Compute Host.

  • Navigate to the Resource Manager Web Interface. You can find the URL in the output of the ProActive Scheduler (the Resource Manager web application URL).

  • Use default demo/demo account to access the Resource Manager Web Interface.

  • Click on 'Portal→Launch' to download node.jar.

  • Run it:

$ java -jar node.jar

It should connect to the ProActive Scheduler automatically using the discovery mechanism otherwise you might have to set the URL to connect to with the -r parameter.

If you would like to execute several Tasks at the same time on one host, you can either launch several ProActive Node process or use the -w parameter to run multiple nodes in the same process. A node executes one Task at a time.

4.2. Deploy ProActive Nodes via SSH

The second way of deploying ProActive Nodes is to create Node Sources from the ProActive Scheduler.

Examples of a Node Source:

  • a cluster with SSH access where nodes are available from 9 a.m. to 9 p.m.

  • nodes from Amazon EC2 available permanently for users from group 'cloud'.

When creating a Node Source, you can choose an Infrastructure Manager from the list of supported Node Source Infrastructures and a Node Source Policy that defines rules and limitations of nodes' utilization.

To create a Node Source you can do any of the following:

  • Use the Resource Manager Web Interface ('Add Nodes' menu)

  • Use the REST API

  • Use the Command Line

In order to create an SSH Node Source you should first configure an SSH access from the server to the Compute Hosts that does not require password. Then create a text file (refered to as HOSTS_FILE below) containig the hostnames of all your Compute Hosts. Each line shoud have the format:

HOSTNAME NODES_COUNT

where NODES_COUNT is the number of ProActive Nodes to start (corresponds to the number of Tasks that can be executed in parallel) on the corresponding host. Lines beginning with # are comments. Here is an example:

# you can use network names
host1.mydomain.com 2
host2.mydomain.com 4
# or ip addresses
192.168.0.10 8
192.168.0.11 16

Then using this file create a Node Source either from the Resource Manager Web Interface or from the command line:

$ PROACTIVE_HOME/bin/proactive-client -createns SSH_node_source --infrastructure org.ow2.proactive.resourcemanager.nodesource.infrastructure.SSHInfrastructure HOSTS_FILE 60000 3 \"\" /home/user/jdk/bin/java PROACTIVE_HOME Linux \"\" config/authentication/rm.cred

Don’t forget to replace SCHEDULER_ADDRESS, HOSTS_FILE and PROACTIVE_HOME with the corresponding values.

See SSH Infrastructure reference for details on each parameter.

4.3. Deploy ProActive Nodes via Agents

In your production environment you might want to control and limit the resources utilization on some or all Compute Hosts, especially if those are desktop machines where people perform their daily activities. Using ProActive Agent you can:

  • Control the number of ProActive Node processes on each Compute Host

  • Launch ProActive Nodes automatically when a Compute Host starts

  • Restart ProActive Nodes if they fail for some reason and reconnect them to ProActive Scheduler

ProActive Agents exists for both Linux and Windows operating systems.

4.3.1. ProActive Linux Agent

The ProActive Agent for Linux can be downloaded from ActiveEon’s website.

To install the ProActive Agent on Debian based distributions:

sudo dpkg -i proactive-agent-standalone.deb
Some extra dependencies might be required, to install them: sudo apt-get -f install

To install the ProActive Agent on Redhat based distributions:

sudo rpm -i proactive-agent-standalone.deb

By default the Linux Agent will launch locally as many nodes as the number of CPU cores available on the host minus one. The URL to use for connecting nodes to the Resource Manager can be configured in /opt/proactive-agent/config.xml, along with several other parameters. If no URL is set, then broadcast is used for auto discovery. However, this feature might not work depending on your network configuration. Consequently, it is recommended to set the Resource Manager URL manually.

Logs related to the Agent daemon are located in /opt/proactive-agent/proactive-agent.log.

Binaries, configuration files and logs related to ProActive nodes started by the agent are available at /opt/proactive-node/, respectively in subfolders dist/lib, config and logs.

Configuring the Linux Agent

To configure the agent behaviour:

  • Stop the Linux Agent

    sudo /etc/init.d/proactive-agent stop
  • Update the config.xml file in /opt/proactive-agent folder

    In case there is no such directory, create the directory. Then, create the config.xml file or create a symbolic link

    sudo mkdir -p /opt/proactive-agent
    sudo ln -f -s <path-to-config-file> /opt/proactive-agent/config.xml
  • Change the ownership of the config.xml file

    sudo chown -Rv proactive:proactive config.xml

    If the group named "proactive" does not exist, create the group and add "proactive" user to the group:

    sudo groupadd proactive
    sudo usermod -g proactive proactive
  • Start the Linux Agent

    sudo /etc/init.d/proactive-agent start
Uninstalling the Linux Agent

On Debian based distributions:

sudo dpkg -r proactive-agent

On Redhat based distributions:

sudo yum remove proactive-agent

4.3.2. ProActive Windows Agent

The ProActive Windows Agent is a Windows Service: a long-running executable that performs specific functions and which is designed to not require user intervention. The agent is able to create a ProActive Node on the local machine and to connect it to the ProActive Resource Manager.

After being installed, it:

  • Loads the user’s configuration

  • Creates schedules according to the working plan specified in the configuration

  • Spawns a ProActive Node that will run a specified java class depending on the selected connection type. 3 types of connections are available:

    • Local Registration - The specified java class will create a ProActive Node and register it locally.

    • Resource Manager Registration - The specified java class will create a ProActive Node and register it in the specified Resource Manager, thus being able to execute java or native tasks received from the Scheduler. It is is important to note that a ProActive Node running tasks can potentially spawn child processes.

    • Custom - The user can specify his own java class.

  • Watches the spawned ProActive Nodes in order to comply with the following limitations:

    • RAM limitation - The user can specify a maximum amount of memory allowed for a ProActive Node and its children. If the limit is reached, then all processes are automatically killed.

    • CPU limitation - The user can specify a maximum CPU usage allowed for a ProActive Node and its children. If the limit is exceeded by the sum of CPU usages of all processes, they are automatically throttled to reach the given limit.

  • Restarts the spawned ProActive Node in case of failures with a timeout policy.

4.3.3. Install Agents on Windows

The ProActive Windows Agent installation pack is available on the official ProActive website. Run the setup.exe file and follow instructions. When the following dialog appears:

install config
  1. Specify the directory that will contain the configuration file named PAAgent-config.xml, note that if this file already exists in the specified directory it will be re-used.

  2. Specify the directory that will contain the log files of the ProActive Agent and the spawned runtimes.

  3. Specify an existing, local account under which the ProActive Nodes will be spawned. It is highly recommended to specify an account that is not part of the Administrators group to isolate the ProActive Node and reduce security risks.

  4. The password is encrypted using Microsoft AES Cryptographic Provider and only Administrators have access permissions to the keyfile (restrict.dat) this is done using the SubInACLtool.

  5. If the specified account does not exist the installation program will prompt the user to create a non-admin account with the required privileges.

    Note that the ProActive Agent service is installed under LocalSystem account, this should not be changed, however it can be using the services.msc utility. ('Control Panel→Administrative Tools→Services')

  6. If you want that any non-admin user (except guest accounts) be able to start/stop the ProActive Agent service check the "Allow everyone to start/stop" box. If this option is checked the installer will use the SubInACL tool. If the tool is not installed in the Program Files\Windows Resource Kits\Tools directory the installer will try to download its installer from the official Microsoft page.

  7. The installer will check whether the selected user account has the required privileges. If not follow the steps to add these privileges:

    1. In the 'Administrative Tools' of the 'Control Panel', open the 'Local Security Policy'.

    2. In 'Security Settings', select 'Local Policies' then select 'User Rights Assignments'.

    3. Finally, in the list of policies, open the properties of 'Replace a process-level token' policy and add the needed user. Do the same for 'Adjust memory quotas for a process'. For more information about these privileges refer to the official Microsoft page.

At the end of the installation, the ProActive Agent Control utility should be started. This next section explains how to configure it.

To uninstall the ProActive Windows Agent, simply run 'Start→Programs→ProActiveAgent→Uninstall ProActive Agent'.

4.3.4. Configure Agents on Windows

To configure the Agent, launch 'Start→Programs→ProActiveAgent→AgentControl' program or click on the notify icon if the "Automatic launch" is activated. Double click on the tray icon to open the ProActive Agent Control window. The following window will appear:

agent control

From the ProActive Agent Control window, the user can load a configuration file, edit it, start/stop the service and view logs. A GUI for editing is provided (explained below). Even if it is not recommended, you can edit the configuration file by yourself with your favorite text editor.

It is also possible to change the ProActive Nodes Account using the 'Change Account' button.

When you click on 'GUI Edit', the following window appears:

config editor general

In the general tab, the user can specify:

  • The ProActive Scheduler location.

  • The JRE location (usually something like C:\Program Files\Java\jdk1.6.0_12).

  • The numbers of Runtimes and Nodes (the number of spawned processes and the number of ProActive Nodes per process).

  • The JVM options. Note that if the parameter contains ${rank}, it will be dynamically replaced by the ProActive Node rank starting from 0.

  • The On Runtime Exit script. A script executed after a ProActive Node exits. This can be useful to perform additional cleaning operation. Note that the script receives as parameter the PID of the ProActive Node.

  • The user can set a memory limit that will prevent the spawned processes to exceed a specified amount of RAM. If a spawned process or its child process requires more memory, it will be killed as well as its child processes. Note that this limit is disabled by default (0 means no limit) and a ProActive Node will require at least 128 MBytes.

  • It is possible to list all available network interfaces by clicking on the "Refresh" button and add the selected network interface name as a value of the proactive.net.interface property by clicking on "Use" button. See the ProActive documentation for further information.

  • The user can specify the protocol (PNP or PAMR) to be used by the ProActive Node for incoming communications.

  • To ensure that a unique port is used by a ProActive Node, the initial port value will be incremented for each node process and given as value of the -Dproactive.SELECTED_PROTOCOL.port JVM property. If the port chosen for a node is already used, it is incremented until an available port number is found.

Clicking on the 'Connection' tab, the window will look like this:

config editor connection

In the 'Connection' tab, the user can select between three types of connections:

  • Local Registration - creates a local ProActive node and registers (advertises) it in a local RMI registry. The node name is optional.

  • Resource Manager Registration - creates a local ProActive node and registers it in the specified Resource Manager. The mandatory Resource Manager’s url must be like protocol://host:port/. The node name and the node source name are optional. Since the Resource Manager requires authentication, the user specifies the file that contains the credential. If no file is specified the default one located in %USERPROFILE%\.proactive\security folder is used.

  • Custom - the user specifies his own java starter class and the arguments to be given to the main method. The java starter class must be in the classpath when the ProActive Node is started.

Finally, clicking on the "Planning" tab, the window will look like this:

config editor planning

In the Planning Tab, depending on the selected connection type, the agent will initiate it according to a weekly planning where each plan specifies the connection start time as well as the working duration. The agent will end the connection as well as the ProActive Nodes and its child processes when the plan duration has expired.

Moreover, it is possible to specify the ProActive Node Priority and its CPU usage limit. The behavior of the CPU usage limit works as follows: if the ProActive Node spawns other processes, they will also be part of the limit so that if the sum of CPU% of all processes exceeds the user limit they will be throttled to reach the given limit. Note that if the Priority is set to RealTime the CPU % throttling will be disabled.

The "Always available" makes the agent to run permanently with a Normal Priority and Max CPU usage at 100%.

4.3.5. Launching Windows Agent

Once you have configured the agent, you can start it clicking on the "Start" button of the ProActive Agent Control window. However, before that, you have to ensure that ProActive Scheduler has been started on the address you specified in the agent configuration. You do not need to start a node since it is exactly the job of the agent.

Once started, you may face some problems. You can realise that an error occurred by first glancing at the color of the agent tray icon. If everything goes right, it should keep the blue color. If its color changes to yellow, it means that the agent has been stopped. To see exactly what happened, you can look at the runtime log file located into the agent installation directory and named Executor<runtime number>Process-log.txt.

The main troubles you may have to face are the following ones:

  • You get an access denied error: this is probably due to your default java.security.policy file which cannot be found. If you want to specify another policy file, you have to add a JVM parameter in the agent configuration. A policy file is supplied in the scheduling directory. To use it, add the following line in the JVM parameter box of the agent configuration (Figure 5.3, “Configuration Editor window - General Tab ”):

-Djava.security.policy=PROACTIVE_HOME/config/security.java.policy-client
  • You get an authentication error: this is probably due to your default credentials file which cannot be found. In the "Connection" tab of the Configuration Editor (Figure 5.4, “Configuration Editor window - Connection Tab (Resource Manager Registration)”), you can choose the credentials file you want. You can select, for instance, the credentials file located at PROACTIVE_HOME/config/authentication/scheduler.cred or your own credentials file.

  • The node seems to be well started but you cannot see it in the Resource Manager interface : in this case, make sure that the port number is the good one. Do not forget that the runtime port number is incremented from the initial ProActive Resource Manager port number. You can see exactly on which port your runtime has been started looking at the log file described above.

5. Available Network Protocols

ProActive Workflows and Scheduling offers several protocols for the Scheduler and nodes to communicate. These protocols provide different features: speed, security, fast error detection, firewall or NAT friendliness but one of the protocols can offer all these features at the same time. Consequently, the selection should be made carefully. Below are introduced available network protocols. Configuration and properties are discussed in Network Properties.

5.1. ProActive Network Protocol

ProActive Network Protocol (PNP) is the general purpose communication protocol (pnp:// scheme). Its performances are quite similar to the well known Java RMI protocol, but it is much more robust and network friendly. It requires only one TCP port per JVM and no shared registry. Besides, it enables fast network failure discovery and better scalability. PNP binds to a given TCP port at startup. All incoming communications use this TCP port. Deploying the Scheduler or a node with PNP requires to open one and only one incoming TCP port per machine.

5.2. ProActive Network Protocol over SSL

ProActive Network Protocol over SSL (PNPS) is the PNP protocol wrapped inside an SSL tunnel. The URI scheme used for the protocol is pnps://. It includes the same features as PNP plus ciphering and optionally authentication. Using SSL creates some CPU overhead which implies that PNPS is slower than PNP.

5.3. ProActive Message Routing

ProActive Message Routing (PAMR) allows the deployment of the Scheduler and nodes behind a firewall. Its associated URI scheme is pamr://. PAMR has the weakest expectations on how the network is configured. Unlike all the other communication protocols introduced previously, it has been designed to work when only outgoing TCP connections are available.

6. Installation on a Cluster with Firewall

When incoming connections are not allowed (ports closed, firewalls, etc.), the ProActive Scheduler allows you to connect nodes without significant changes in your network firewall configuration. It relies on the PAMR protocol.

This last does not expect bidirectional TCP connections. It has been designed to work when only outgoing TCP connections are available. Such environments can be encountered due to:

  • Network address translation devices

  • Firewalls allowing only outgoing connections (this is the default setup of many firewalls)

  • Virtual Machines with a virtualized network stack

firewall

When PAMR is activated, the ProActive Scheduler and nodes connect to a PAMR router. This connection is kept open, and used as a tunnel to receive incoming messages. If the tunnel goes down, it is automatically reopened by nodes.

The biggest drawback of PAMR is that a centralized PAMR router is in charge of routing message between all the PAMR clients. To soften this limitation PAMR can be used with other communication protocols. This way, PAMR is used only when needed.

By default, PNP is enabled. PNP is the default protocol for better performance and nodes can also use PAMR if needed. The PAMR Router is started by default along with the ProActive Scheduler.

For a ProActive Node to connect to the ProActive Scheduler using PAMR, the following ProActive configuration file can be used (PROACTIVE_HOME/config/network/node.ini). The properties tell the ProActive Node where to find the PAMR router. The ProActive Node will then connect to pamr://0 where 0 is the PAMR id of the Scheduler (0 by default).

proactive.communication.protocol=pamr
proactive.pamr.router.address=ROUTER_HOSTNAME
proactive.pamr.router.port=33647

This sample configuration requires to open only one port 33647 for incoming connections on the router host and all the ProActive Nodes will be able to connect to the Scheduler.

PAMR communication can be tunneled using SSH for better security. In that case, the ProActive node will establish a SSH tunnel between him and the ProActive Scheduler and use that tunnel for PAMR traffic.

See PAMR Protocol Properties reference for a detailed explanation of each property.

7. Control the resource usage

7.1. Policies

You can limit the utilization of resources connected to the ProActive Scheduler in different ways. When you create node sources you can use a node source policy.

Node source policy is a set of rules and conditions which describes when and how many nodes have to be selected for computations.

Each node source policy regardless it specifics has a common part where you describe users' and groups' permissions. When you create a policy you must specify them:

  • nodeUsers - utilization permission that defines who can get nodes for computations from this node source. It has to take one of the following values:

    • ME - only the node source creator

    • users=user1,user2;groups=group1,group2;tokens=t1,t2 - only specific users, groups or tokens. I.e. users=user1 - node access is limited to user1; users=user1;groups=group1 - node access is limited to user1 and all users from group group1; users=user1;tokens=t1 - node access is limited to user1 or anyone who specified token t1. If node access is protected by a token, node will not be found by the ProActive Resource Manager (getNodes request) unless the corresponding token is specified.

    • ALL - everybody can use nodes from this node source

  • nodeProviders - provider permission defines who can add nodes to this node source. It should take one of the following values:

    • ME - pnly the node source creator

    • users=user1,user2;groups=group1,group2 - only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - everybody can add nodes to this node source

The user who created the node source is the administrator of this node source. He can add and removed nodes to it, remove the node source itself, but cannot use nodes if usage policy is set to PROVIDER or PROVIDER_GROUPS.

In the ProActive Resource Manager, there is always a default node source configured with a DefaultInfrastructureManager and a Static policy. It is not able to deploy nodes anywhere but makes it possible to add existing nodes to the Scheduler (see Deploy ProActive Nodes manually)

Out of the box the Scheduler supports time slot policies, cron policies, load based policies and many others. Please see detailed information about policies in Node Source Policy.

7.2. Agents schedule

Node source policies limit ProActive Nodes utilization on the level of the ProActive Scheduler. If you need fine-grained limits on the node level ProActive Agents will help you achieve that.

The typical scenario is when you use desktop workstation for computations during non working hours.

Both linux and windows agents have an ability to:

  • Run ProActive Nodes according to the schedule

  • Limit resources utilization for these daemons (e.g CPU, memory)

Agents configuration is detailed in the section Deploy ProActive Nodes via Agents.

8. User Authentication

In order to use ProActive Scheduler every user must have an account. It supports two methods for authentication:

  • File based

  • LDAP

8.1. Select authentication method

By default the ProActive Scheduler is configured to use file based authentication and has some default accounts ('demo/demo', 'admin/admin') that work out of the box.

If you would like to use your LDAP server you need to modify two configs:

  • Resource Manager configuration (PROACTIVE_HOME/config/rm/settings.ini)

    #Property that defines the method that has to be used for logging users to the Resource Manager
    #It can be one of the following values:
    #    - "RMFileLoginMethod" to use file login and group management
    #    - "RMLDAPLoginMethod" to use LDAP login management
    pa.rm.authentication.loginMethod=RMLDAPLoginMethod
  • Scheduler configuration (PROACTIVE_HOME/config/scheduler/settings.ini)

    #Property that define the method that have to be used for logging users to the Scheduler
    #It can be one of the following values :
    #	- "SchedulerFileLoginMethod" to use file login and group management
    #	- "SchedulerLDAPLoginMethod" to use LDAP login management
    pa.scheduler.core.authentication.loginMethod=SchedulerLDAPLoginMethod

8.2. File

By default, the ProActive Resource Manager stores users accounts, passwords, and group memberships (user or admin), in two files:

  • users and passwords accounts are stored in PROACTIVE_HOME/config/authentication/login.cfg. Each line has to follow the format user:passwd. The default login.cfg file is given hereafter:

    admin:admin
    user:pwd
    demo:demo
  • users membership is stored in PROACTIVE_HOME/config/authentication/group.cfg. For each user registered in login.cfg, a group membership has to be defined in this file. Each line has to look like user:group. Group has to be user to have user rights, or admin to have administrator rights. Below is an extract of group.cfg:

    admin:admin
    demo:admin
    user:user

8.3. LDAP

The ProActive Resource Manager is able to connect to an existing LDAP server, to check users login/password and verify users group membership. This authentication method can be used with existing LDAP server that is already configured.

In order to use it, few parameters have to be configured, such as path in LDAP tree users, LDAP groups that define user and admin group membership, URL of the LDAP server, LDAP binding method used by connection and configuration of SSL/TLS if you want a secured connection between the ProActive Resource Manager and LDAP.

We assume that LDAP server is configured in the way that:

  • all existing users and groups are located under single domain

  • users have object class specified in parameter pa.ldap.user.objectclass

  • groups have object class specified in parameter pa.ldap.group.objectclass

  • user and group name is defined in cn (Common Name) attribute

# EXAMPLE of user entry
#
# dn: cn=jdoe,dc=example,dc=com
# cn: jdoe
# firstName: John
# lastName: Doe
# objectClass: inetOrgPerson

# EXAMPLE of group entry
#
# dn: cn=mygroup,dc=example,dc=com
# cn: mygroup
# firstName: John
# lastName: Doe
# uniqueMember: cn=djoe,dc=example,dc=com
# objectClass: groupOfUniqueNames

The LDAP configuration is defined in PROACTIVE_HOME/config/authentication/ldap.cfg. You need to:

  1. Set the LDAP server URL

    First, you have to define the LDAP’s URL of your organisation. This address corresponds to the property: pa.ldap.url. You have to put a standard LDAP-like URL, for example ldap://myLdap. You can also set a URL with secure access: ldaps://myLdap:636.

  2. Define object class of user and group entities

    Then you need to define how to differ user and group entities in LDAP tree. The users object class is defined by property pa.ldap.user.objectclass and by default is inetOrgPerson. For groups, the property pa.ldap.group.objectclass has a default value groupOfUniqueNames which could be changed.

  3. Configure LDAP authentication parameters

    By default, the ProActive Scheduler binds to LDAP in anonymous mode. You can change this authentication method by modifying the property pa.ldap.authentication.method. This property can have several values:

    • none (default value) - the ProActive Resource Manager performs connection to LDAP in anonymous mode.

    • simple - the ProActive Resource Manager performs connection to LDAP with a specified login/password (see below for user password setting).

      You can also specify a SASL mechanism for LDAPv3. There are many SASL available mechanisms: cram-md5, digest-md5, kerberos4. Just set this property to sasl to let the ProActive Resource Manager JVM choose SASL authentication mechanism. If you specify an authentication method different from 'none' (anonymous connection to LDAP), you must specify a login/password for authentication.

      There are two properties to set in LDAP configuration file:

      • pa.ldap.bind.login - sets user name for authentication.

      • pa.ldap.bind.pwd - sets password for authentication.

  4. Set SSL/TLS parameters

    A secured SSL/TLS layer can be useful if your network is not trusted, and critical information is transmitted between the rm server and LDAP, such as user passwords. First, set the LDAP URL property pa.ldap.url to a URL of type ldaps://myLdap. Then set pa.ldap.authentication.method to none so as to delegate authentication to SSL.

    For using SSL properly, you have to specify your certificate and public keys for SSL handshake. Java stores certificates in a keyStore and public keys in a trustStore. In most of the cases, you just have to define a trustStore with public key part of LDAP’s certificate. Put certificate in a keyStore, and public keys in a trustStore with the keytool command (keytool command is distributed with standard java platforms):

    keytool -import -alias myAlias -file myCertificate -keystore myKeyStore

    myAlias is the alias name of your certificate, myCertificate is your private certificate file and myKeyStore is the new keyStore file produced in output. This command asks you to enter a password for your keyStore.

    Put LDAP certificate’s public key in a trustStore, with the keytool command:

    keytool -import -alias myAlias -file myPublicKey -keystore myTrustStore

    myAlias is the alias name of your certificate’s public key, myPublicKey is your certificate’s public key file and myTrustore is the new trustStore file produced in output. This command asks you to enter a password for your trustStore.

    Finally, in config/authentication/ldap.cfg, set keyStore and trustStore created before to their respective passwords:

    • Set pa.ldap.keystore.path to the path of your keyStore.

    • Set pa.ldap.keystore.passwd to the password defined previously for keyStore.

    • Set pa.ldap.truststore.path to the path of your trustStore.

    • Set pa.ldap.truststore.passwd to the password defined previously for trustStore.

  5. Use fall back to file authentication

    You can use simultaneously file-based authentication and LDAP-based authentication. Then ProActive Scheduler can check user password and group membership in login and group files, as performed in FileLogin method, if user or group is not found in LDAP. It uses pa.rm.defaultloginfilename and pa.rm.defaultgroupfilename files to authenticate user and check group membership. There are two rules:

    • If LDAP group membership checking fails, fall back to group membership checking with group file. To activate this behavior set pa.ldap.group.membership.fallback to true, in LDAP configuration file.

    • If a user is not found in LDAP, fall back to authentication and group membership checking with login and group files. To activate this behavior, set pa.ldap.authentication.fallback to true, in LDAP configuration file.

9. User Permissions

All users authenticated in the Resource Manager have they own role according to granted permissions. In ProActive Scheduler, we use the standard Java Authentication and Authorization Service (JAAS) to address these needs.

The file PROACTIVE_HOME/config/security.java.policy-server allows to configure fine-grained access for all users, e.g. who has the right to:

  • Deploy ProActive Nodes

  • Execute jobs

  • Pause the Scheduler

  • etc

10. Monitor the cluster state

Cluster monitoring typically means checking that all ProActive Nodes that were added to ProActive Scheduler are up and running. We don’t track for example the free disk space or software upgrade which can be better achieved with tools like Nagios.

In the Resource Manager Web Interface you can see how many ProActive Nodes were added to the Resource Manager and their usage.

admin web

The same information is accessible using the command line:

$ PROACTIVE_HOME/bin/proactive-client --listnodes

10.1. ProActive Node States

When you look at your cluster ProActive Nodes can be in one of the following states

  • Deploying - The deployment of the node has been triggered by the ProActive Resource Manager but it has not yet been added.

  • Lost - The deployment of the node has failed for some reason. The node has never been added to the ProActive Resource Manager and won’t be usable.

  • Configuring - Node has been added to the ProActive Resource Manager and is being configured.

  • Free - Node is available for computations.

  • Busy - Node has been given to user to execute computations.

  • Locked - Node is under maintenance and cannot be used for computations.

  • To be removed - Node is busy but requested to be removed. So it will be removed once the client will release it.

  • Down - Node is unreachable or down and cannot be used anymore.

10.2. JMX

The JMX interface for remote management and monitoring provides information about the running ProActive Resource Manager and allows the user to modify its configuration. For more details about JMX concepts, please refer to official documentation about the JMX architecture.

jmx archi

The following aspects (or services) of the ProActive Scheduler are instrumented using MBeans that are managed through a JMX agent.

  • Server status is exposed using the RuntimeDataMBean

    • The Resource Manager status

    • Available/Free/Busy/Down nodes count

    • Average activity/inactivity percentage

  • The Accounts Manager exposes accounting information using the MyAccountMBean and AllAccountsMBean

    • The used node time

    • The provided node time

    • The provided node count

  • Various management operations are exposed using the ManagementMBean

    • Setting the accounts refresh rate

    • Refresh all accounts

    • Reload the permission policy file

MBean server can be accessed by remote applications using one of the two available connectors

  • The standard solution based on Remote Method Invocation (RMI) protocol is the RMI Connector accessible at the following url: service:jmx:rmi:///jndi/rmi://HOSTNAME:PORT/JMXRMAgent where

    • HOSTNAME is the hostname on which the Resource Manager is started

    • PORT (5822 by default) is the port number on which the JMX RMI connector server has been started. It is defined by the property pa.rm.jmx.port .

  • The ProActive Remote Objects Connector provides ProActive protocol aware connector accessible at the following url: service:jmx:ro:///jndi/PA_PROTOCOL://HOSTNAME:PORT/JMXRMAgent where

    • PA_PROTOCOL is the protocol defined by the proactive.communication.protocol property

    • HOSTNAME is the hostname on which the Resource Manager is started

    • PORT is the protocol dependent port number usually defined by the property proactive.PA_PROTOCOL.port

The name of the connector (JMXRMAgent by default) is defined by the property rm.jmx.connectorname.

The JMX url to connect to can be obtained from the Authentication API of the Resource Manager or by reading the log file located in PROACTIVE_HOME/logs/RM.log. In that log file, the address you have to retrieve is the one where the JMX RMI connector server has been started

[INFO 2010-06-17 10:23:27,813] [RM.AbstractJMXHelper.boot] Started JMX RMI connector server at service:jmx:rmi:///jndi/rmi://kisscool.inria.fr:5822/JMXRMAgent

Once connected, you’ll get an access to Resource Manager statistics and accounting.

For example, to connect to the ProActive Scheduler JMX Agent with JConsole tool, just enter the url of the standard RMI Connector, as well as the username and the password.

jmx jconsole connect

Then depending on the allowed permissions browse the attributes of the MBeans.

jmx jconsole

10.3. Accounting

The users of ProActive Scheduler request and offer nodes for computation. To keep track of how much node time was consumed or contributed by a particular user, ProActive Scheduler associates a user to an account.

More precisely, the nodes can be manipulated by the following basic operations available to the users

  • The ADD operation is a registration of a node in the Resource Manager initiated by a user considered as a node provider. A node can be added, through the API, as a result of a deployment process, through an agent or manually from the command line interface.

  • The REMOVE operation is the unregistration of a node from the Resource Manager. A node can be removed, through the API, by a user or automatically if it is unreachable by the Resource Manager.

  • The GET operation is a node reservation, for an unknown amount of time, by a user considered as a node owner. For example, the ProActive Scheduler can be considered as a user that reserves a node for a task computation.

  • The RELEASE operation on a reserved node by any user.

The following accounting data is gathered by the Resource Manager

  • The used node time: The amount of time other users have spent using the resources of a particular user. More precisely, for a specific node owner, it is the sum of all time intervals from GET to RELEASE.

  • The provided node time: The amount of time a user has offered resources to the Resource Manager. More precisely, for a specific node provider, it is the sum of all time intervals from ADD to REMOVE.

  • The provided node count: The number of provided nodes.

The accounting information can be accessed only through a JMX client or the ProActive Resource Manager command line.

11. Run Computation with a user’s system account

Configure a ProActive Node to execute tasks under a user’s system account by ticking the Run as me box in the task configuration. By default authentication is done through a password, but can also be configured using a SSH key.

For proper execution, the user’s system account must have:

  • Execution rights to the PROACTIVE_HOME directory and all it’s parent directories.

  • Write access to the PROACTIVE_HOME directory.

Logs are written during task execution, as a matter of fact, the executing user needs write access to the PROACTIVE_HOME directory. ProActive does some tests before executing as a different user, those need full execution rights, including the PROACTIVE_HOME parent directories.

Create a ProActive group which owns the PROACTIVE_HOME directory and has write access. Add every system user, which executes tasks under its own system account, to the ProActive group.

Find a step by step tutorial here.

11.1. Using password

The ProActive Node will try to impersonate the user that submitted the task when running it. It means the username and password must be the same between the Scheduler and the operating system.

11.2. Using SSH keys

A SSH key can be tied to the user’s account and used to impersonate the user when running a task (using SSH). The private key of the user must be provided.

To enable this method, set the system property pas.launcher.forkas.method to key when starting a ProActive Node.

On Mac OS X, the default temporary folder is not shared between all users. It is required by this feature. To use a shared temporary folder, you need to set the $TMPDIR environment variable and the Java property java.io.tmpdir to /tmp before starting the ProActive Node.

12. Configure Web applications

The ProActive Scheduler deploys several web applications :

  • Scheduler Web Interface

  • Resource Manager Interface

  • Workflow Studio

  • REST API

All these applications can be found in PROACTIVE_HOME/dist/war and are automatically deployed in an embedded Jetty web server when starting the Scheduler.

To configure Web applications, for instance the HTTP port to use, edit the PROACTIVE_HOME/config/web/settings.ini file.

Web applications are deployed on the host using the HTTP port 8080 by default. The local part of the URL is based on the war files found in PROACTIVE_HOME/dist/war. For instance, the scheduler web portal is available at http://localhost:8080/scheduler.

12.1. Enable HTTPS

If you are familiar with Nginx or similar tools, you can also use it to enable HTTPS connection to Web applications. It will also enable you to redirect HTTP requests to HTTPS for instance.

HTTPS can be enabled for Web applications using a Java keystore. See web.https properties in PROACTIVE_HOME/config/web/settings.ini for the possible settings. We provided a default self-signed keystore than can be used for testing but for production you should build your own keystore with a valid certificate so it will be accepted by browsers.

12.2. Enable VNC remote visualization in the browser

The Rest API web application embeds a proxy to allow the remote display visualization of a ProActive Node running a given task via a VNC server from the browser. To enable this feature, the scheduler has to be configured as follows:

  • configure the proxy: edit the PROACTIVE_HOME/config/web/settings.ini file and set novnc.enabled to true in order to start the proxy.

  • configure the scheduler portal to use the proxy when opening a new tab browser that shows the remote visualization: edit the PROACTIVE_HOME/dist/war/scheduler/scheduler.conf.

    • Set sched.novnc.url to the public address of the proxy. This public address is the public address of the host where the sheduler is started and the port specified by the option novnc.port in the file PROACTIVE_HOME/config/web/settings.ini

    • Set sched.novnc.page.url to the public address of the VNC client to be executed in the browser. The client is located in the REST API with the file novnc.html.

12.3. Workflow Catalog

The Workflow Catalog stores ProActive Workflows through a REST API. It is subdivided into buckets. Each bucket has a unique name and stores zero, one or more versioned ProActive workflows.

By default, ProActive Workflows are persisted on disk using the embedded HSQL database. The data is located in PROACTIVE_HOME/data/db/workflow-catalog.

The Workflow Catalog is a WAR file which contains a configuration file. More information regarding the Workflow Catalog configuration can be found in Workflow Catalog Properties.

A complete documentation of the Workflow Catalog REST API is available by default on:

http://localhost:8080/workflow-catalog

This documentation is automatically generated using Swagger.

13. Configure script engines

Most script engines do not need any configuration. This section talks about the script engines which can be configured individually.

13.1. Docker Compose (Docker task)

The Docker Compose script engine is a wrapper around the Docker Compose command. It needs to have Docker Compose and Docker installed, in order to work. The Docker Compose script engine has a configuration file which configures each ProActive Node individually. The configuration file is in PROACTIVE_HOME/config/scriptengines/docker-compose.properties. The configuration file is in PROACTIVE_HOME/config/scriptengines/docker-compose.properties. docker-compose.properties has the following configuration options:


docker.compose.command=[docker-compose executable example:/usr/local/bin/docker-compose] --- Defines which Docker Compose executable is used by the script engine.


docker.compose.sudo.command=[sudo executable example:/usr/bin/sudo] --- Defines the sudo executable which is used. The sudo property is used to give the Docker Compose command root rights.


docker.compose.use.sudo=false --- Defines whether to execute Docker Compose with sudo [true] or without sudo [false]


docker.host= --- Defines the DOCKER_HOST environment variable. The DOCKER_HOST variable defines the Docker socket which the Docker client connects to. That property is useful when accessing a Docker daemon which runs inside a container or on a remote machine. Further information about the DOCKER_HOST property can be found in the official Docker documentation.

14. Performance tuning

14.1. Database

By default, ProActive Scheduler is configured to use HSQLDB embedded database. This last is lightweight and does not require complex configuration. However, it cannot compete with more conventional database management systems, especially when the load increases. Consequently, it is recommended to setup an RDBMS such as MariaDB, Postgres or SQL Server if you care about performance or notice a slowdown.

ProActive Workflows and Scheduling distribution is provided with several samples for configuring the Scheduler and Resource Manager to use most standard RDBMS. You can find these examples in PROACTIVE_HOME/config/rm/templates/ and PROACTIVE_HOME/config/scheduler/templates/ folders.

14.1.1. Indexes

The Scheduler and RM databases rely on Hibernate to create and update the database schema automatically on startup. Hibernate is configured for adding indexes on frequently used columns, including columns associated to foreign keys. The goal is to provide a default configuration that behaves in a satisfactory manner with the different features available in the Scheduler. For instance, the Jobs housekeeping feature requires indexes on most foreign keys since a Job deletion may imply several cascade delete.

Unfortunately, each database vendor implements its own strategy regarding indexes on foreign keys. Thus, some databases such as MariaDB and MySQL automatically add indexes for foreign keys whereas some others like Postgres or Oracle do not add such indexes by default. Besides, Hibernate does not offer control over this strategy. As a consequence, this difference of behaviour, once linked to our default configuration for Hibernate, may lead to duplicate indexes (two indexes with different names for a same column) that are created for columns in RM and Scheduler tables.

In case you are using in production MySQL, MariaDB or a similar database that automatically adds indexes on foreign keys, it is strongly recommended to ask your DBA to identify and delete duplicate indexes created by the database system since they may hurt performances.

15. Troubleshooting

15.1. Logs

If something goes wrong the first place to look for the problem are the Scheduler logs. By default all logs are in PROACTIVE_HOME/logs directory.

Users submitting jobs have access to server logs of their jobs through the Scheduler Web interface

server logs

15.2. Common Problems

15.2.1. 'Path too Long' Errors When Unzipping Windows Downloads

When you unzip a Windows package using the default Windows compression utility, you might get errors stating that the path is too long. Path length is determined by the Windows OS. The maximum path, which includes drive letter, colon, backslash, name components separated by backslashes, and a terminating null character, is defined as 260 characters.

Workarounds:

  • Move the Zip file on the root level of the system drive and unzip from there.

  • Use a third-party compression utility. Unlike the default Windows compression utility, some third-party utilities allow for longer maximum path lengths.

16. Reference

16.1. Scheduler Properties

Scheduler Properties are read when ProActive Scheduler is started therefore you need to restart it to apply changes.

The default configuration file is SCHEDULER_HOME/scheduler/settings.ini.

# Scheduler home directory (this default value should be proper in most cases)
pa.scheduler.home=.

# Timeout for the scheduling loop (in millisecond)
pa.scheduler.core.timeout=2000

# Auto-reconnection to the Resource Manger
pa.scheduler.core.rmconnection.autoconnect = true
pa.scheduler.core.rmconnection.timespan = 5000
pa.scheduler.core.rmconnection.attempts = 20

# Number of threads used to execute client requests
pa.scheduler.core.clientpoolnbthreads=5

# Number of threads used to execute internal scheduling operations
pa.scheduler.core.internalpoolnbthreads=5

# Check for failed node frequency (in second)
pa.scheduler.core.nodepingfrequency=20

# Cache classes definition in task class servers
pa.scheduler.classserver.usecache=true;

# Scheduler default policy full name
pa.scheduler.policy=org.ow2.proactive.scheduler.policy.DefaultPolicy

# Defines the maximum number of tasks to be scheduled in each scheduling loop.
pa.scheduler.policy.nbtaskperloop=10

#Name of the JMX MBean for the scheduler
pa.scheduler.core.jmx.connectorname=JMXSchedulerAgent

# port of the JMX service for the Scheduler.
pa.scheduler.core.jmx.port=5822

# Accounting refresh rate from the database in seconds
pa.scheduler.account.refreshrate=180

# RRD data base with statistic history
pa.scheduler.jmx.rrd.name=scheduler_statistics.rrd

# RRD data base step in seconds
pa.scheduler.jmx.rrd.step=4

# User session time (user is automatically disconnect after this time if no request is made to the scheduler)
# negative number indicates that session is infinite (value specified in second)
pa.scheduler.core.usersessiontime=3600

# Timeout for the start task action. Time during which the scheduling could be waiting (in millis)
# this value relies on the system and network capacity
pa.scheduler.core.starttask.timeout=5000

# Maximum number of threads used for the start task action. This property define the number of blocking resources
# until the scheduling loop will block as well.
# As it is related to the number of nodes, this property also define the number of threads used to terminate taskLauncher
pa.scheduler.core.starttask.threadnumber=5

# Maximum number of threads used to send events to clients. This property defines the number of clients
# than can block at the same time. If this number is reached, every clients won't receive events until
# a thread unlock.
pa.scheduler.core.listener.threadnumber=5

#-------------------------------------------------------
#----------------   JOBS PROPERTIES   ------------------
#-------------------------------------------------------

# Job multiplicative factor. (Task id will be jobId*this_factor+taskId)
pa.scheduler.job.factor=10000

# Remove job delay (in second). (The time between getting back its result and removing it from the scheduler)
# Set this time to 0 if you don't want the job to be remove.
pa.scheduler.core.removejobdelay=0

# Automatic remove job delay (in second). (The time between the termination of the job and removing it from the scheduler)
# Set this time to 0 if you don't want the job to be remove automatically.
pa.scheduler.core.automaticremovejobdelay=0

# Remove job in dataBase when removing it from scheduler.
pa.scheduler.job.removeFromDataBase=false

#-------------------------------------------------------
#---------------   TASKS PROPERTIES   ------------------
#-------------------------------------------------------
# Initial time to wait before the re-execution of a task. (in millisecond)
pa.scheduler.task.initialwaitingtime=1000

# Maximum number of execution for a task in case of failure (node down)
pa.scheduler.task.numberofexecutiononfailure=2

# If true tasks are ran in a forked JVM, if false they are ran in the node's JVM
pa.scheduler.task.fork=true

#-------------------------------------------------------
#-------------   DATASPACES PROPERTIES   ---------------
#-------------------------------------------------------

# Default INPUT space URL. The default INPUT space is used inside each job that does not define an INPUT space.
# Normally, the scheduler will start a FileSystemServer on a default location based on the TEMP directory.
# If the following property is specified, this FileSystemServer will be not be started and instead the provided dataspace
# url will be used
#pa.scheduler.dataspace.defaultinput.url=

# The following property can be used in two ways.
# 1) If a "pa.scheduler.dataspace.defaultinput.url" is provided, the defaultinput.path property
#   tells the scheduler where the actual file system is (provided that he has access to it). If the scheduler does not have
#   access to the file system where this dataspace is located then this property must not be set.
#       - On windows, use double backslash in the path, i.e. c:\\users\\...
#       - you can provide a list of urls separated by spaces , i.e. : http://myserver/myspace file:/path/to/myspace
#       - if one url contain spaces, wrap all urls in the list between deouble quotes :
#               "http://myserver/myspace"  "file:/path/to/my space"
# 2) If a "pa.scheduler.dataspace.defaultinput.url" is not provided, the defaultinput.path property will tell the scheduler
#   to start a FileSystemServer on the provided defaultinput.path instead of its default location

### the default location is SCHEDULER_HOME/data/defaultinput
#pa.scheduler.dataspace.defaultinput.localpath=

# Host name from which the localpath is accessible, it must be provided if the localpath property is provided
#pa.scheduler.dataspace.defaultinput.hostname=

# The same for the OUPUT (see above explanations in the INPUT SPACE section)
# (concerning the syntax, see above explanations in the INPUT SPACE section)
#pa.scheduler.dataspace.defaultoutput.url=
### the default location is SCHEDULER_HOME/data/defaultoutput
#pa.scheduler.dataspace.defaultoutput.localpath=
#pa.scheduler.dataspace.defaultoutput.hostname=

# The same for the GLOBAL space. The GLOBAL space is shared between each users and each jobs.
# (concerning the syntax, see above explanations in the INPUT SPACE section)
#pa.scheduler.dataspace.defaultglobal.url=
### the default location is SCHEDULER_HOME/data/defaultglobal
#pa.scheduler.dataspace.defaultglobal.localpath=
#pa.scheduler.dataspace.defaultglobal.hostname

# The same for the USER spaces. A USER space is a per-user global space. An individual space will be created for each user in subdirectories of the defaultuser.localpath.
# Only one file server will be created (if not provided)
# (concerning the syntax, see above explanations in the INPUT SPACE section)
#pa.scheduler.dataspace.defaultuser.url=
### the default location is SCHEDULER_HOME/data/defaultuser
#pa.scheduler.dataspace.defaultuser.localpath=
#pa.scheduler.dataspace.defaultuser.hostname=

#-------------------------------------------------------
#----------------   LOGS PROPERTIES   ------------------
#-------------------------------------------------------
# Logs forwarding method
# Possible methods are :
# Simple socket : org.ow2.proactive.scheduler.common.util.logforwarder.providers.SocketBasedForwardingProvider
# SSHTunneled socket : org.ow2.proactive.scheduler.common.util.logforwarder.providers.SocketWithSSHTunnelBasedForwardingProvider
# ProActive communication : org.ow2.proactive.scheduler.common.util.logforwarder.providers.ProActiveBasedForwardingProvider
#
# set this property to empty string to disable log forwarding alltogether
pa.scheduler.logs.provider=org.ow2.proactive.scheduler.common.util.logforwarder.providers.ProActiveBasedForwardingProvider
# Location of server jobs logs (comment to disable job logging to separate files). Can be an absolute path.
pa.scheduler.job.logs.location=logs/jobs/

# Size limit for job and task logs in bytes
pa.scheduler.job.logs.max.size=10000

#-------------------------------------------------------
#-----------   AUTHENTICATION PROPERTIES   -------------
#-------------------------------------------------------

# path to the Jaas configuration file which defines what modules are available for internal authentication
pa.scheduler.auth.jaas.path=config/authentication/jaas.config

# path to the private key file which is used to encrypt credentials for authentication
pa.scheduler.auth.privkey.path=config/authentication/keys/priv.key

# path to the public key file which is used to encrypt credentials for authentication
pa.scheduler.auth.pubkey.path=config/authentication/keys/pub.key

# LDAP Authentication configuration file path, used to set LDAP configuration properties
# If this file path is relative, the path is evaluated from the Scheduler dir (ie application's root dir)
# with the variable defined below : pa.scheduler.home.
# else, (if the path is absolute) it is directly interpreted
pa.scheduler.ldap.config.path=config/authentication/ldap.cfg

# Login file name for file authentication method
# If this file path is relative, the path is evaluated from the Scheduler dir (ie application's root dir)
# with the variable defined below : pa.scheduler.home.
# else, the path is absolute, so the path is directly interpreted
pa.scheduler.core.defaultloginfilename=config/authentication/login.cfg

# Group file name for file authentication method
# If this file path is relative, the path is evaluated from the Scheduler dir (ie application's root dir)
# with the variable defined below : pa.scheduler.home.
# else, the path is absolute, so the path is directly interpreted
pa.scheduler.core.defaultgroupfilename=config/authentication/group.cfg

#Property that define the method that have to be used for logging users to the Scheduler
#It can be one of the following values :
#	- "SchedulerFileLoginMethod" to use file login and group management
#	- "SchedulerLDAPLoginMethod" to use LDAP login management
pa.scheduler.core.authentication.loginMethod=SchedulerFileLoginMethod

#-------------------------------------------------------
#------------------   RM PROPERTIES   ------------------
#-------------------------------------------------------
# Path to the Scheduler credentials file for RM authentication
pa.scheduler.resourcemanager.authentication.credentials=config/authentication/scheduler.cred

# Use single or multiple connection to RM :
# (If true)  the scheduler user will do the requests to rm
# (If false) each Scheduler users have their own connection to RM using their scheduling credentials
pa.scheduler.resourcemanager.authentication.single=true

# Set a timeout for initial connection to the RM connection (in ms)
pa.scheduler.resourcemanager.connection.timeout=120000

#-------------------------------------------------------
#--------------   HIBERNATE PROPERTIES   ---------------
#-------------------------------------------------------
# Hibernate configuration file (relative to home directory)
pa.scheduler.db.hibernate.configuration=config/scheduler/database.cfg.xml

# Drop database before creating a new one
# If this value is true, the database will be dropped and then re-created
# If this value is false, database will be updated from the existing one.
pa.scheduler.db.hibernate.dropdb=false

# This property is used to limit number of finished jobs loaded from the database
# at scheduler startup. For example setting this property to '10d' means that
# scheduler should load only finished jobs which were submitted during last
# 10 days. In the period expression it is also possible to use symbols 'h' (hours)
# and 'm' (minutes).
# If property isn't set then all finished jobs are loaded.
pa.scheduler.db.load.job.period=

# Set to true to enable email notificaions about finished jobs. Emails
# are sent to the address specified in the generic information of a
# job with the key EMAIL; example:
#    <genericInformation>
#        <info name="EMAIL" value="user@example.com"/>
#    </genericInformation>
pa.scheduler.notifications.email.enabled=false
# From address for notificaions emails (set it to a valid address if
# you would like email notifications to work)
pa.scheduler.notifications.email.from=

16.2. Resources Manager Properties

Resource Manager Properties are read when ProActive Scheduler is started therefore you need to restart it to apply changes.

The default configuration file is SCHEDULER_HOME/rm/settings.ini.

# definition of all java properties used by resource manager
# warning : definition of these variables can be override by user at JVM startup,
# using for example -Dpa.rm.home=/foo, in the java command

# name of the ProActive Node containing RM's active objects
pa.rm.node.name=RM_NODE

# number of local nodes to start with the Resource Manager
pa.rm.local.nodes.number=4

# ping frequency used by node source for keeping a watch on handled nodes (in ms)
pa.rm.node.source.ping.frequency=45000

# ping frequency used by resource manager to ping connected clients (in ms)
pa.rm.client.ping.frequency=45000

# The period of sending "alive" event to resource manager's listeners (in ms)
pa.rm.aliveevent.frequency=300000

# timeout for selection script result
pa.rm.select.script.timeout=30000

# number of selection script digests stored in the cache to predict the execution results
pa.rm.select.script.cache=10000

# The time period when a node has the same dynamic characteristics (in ms).
# It needs to pause the permanent execution of dynamic scripts on nodes.
# Default is 5 mins, which means that if any dynamic selection scripts returns
# false on a node it won't be executed there at least for this time.
pa.rm.select.node.dynamicity=300000

# The full class name of the policy selected nodes
pa.rm.selection.policy=org.ow2.proactive.resourcemanager.selection.policies.ShufflePolicy

# Timeout for remote script execution (in ms)
pa.rm.execute.script.timeout=180000

# If set to non-null value the resource manager executes only scripts from this directory.
# All other selection scripts will be rejected.
pa.rm.select.script.authorized.dir=

# timeout for node lookup
pa.rm.nodelookup.timeout=30000

# GCM application (GCMA) file path, used to perform GCM deployments
# If this file path is relative, the path is evaluated from the Resource manager dir (ie application's root dir)
# defined by the "pa.rm.home" JVM property
# else, the path is absolute, so the path is directly interpreted
pa.rm.gcm.template.application.file=config/rm/deployment/GCMNodeSourceApplication.xml

# java property string defined in the GCMA defined above, which is dynamically replaced
# by a GCM deployment descriptor file path to deploy
pa.rm.gcmd.path.property.name=gcmd.file

# Resource Manager home directory
pa.rm.home=.

# Lists of supported infrastructures in the resource manager
pa.rm.nodesource.infrastructures=config/rm/nodesource/infrastructures

# Lists of supported node acquisition policies in the resource manager
pa.rm.nodesource.policies=config/rm/nodesource/policies

# Max number of threads in node source for parallel task execution
pa.rm.nodesource.maxthreadnumber=50

# Max number of threads in selection manager
pa.rm.selection.maxthreadnumber=50

# Max number of threads in monitoring
pa.rm.monitoring.maxthreadnumber=5

# Number of threads in the node cleaner thread pool
pa.rm.cleaning.maxthreadnumber=5

#Name of the JMX MBean for the RM
pa.rm.jmx.connectorname=JMXRMAgent

#port of the JMX service for the RM.
pa.rm.jmx.port=5822

#Accounting refresh rate from the database in seconds (0 means disabled)
pa.rm.account.refreshrate=180

# RRD data base with statistic history
pa.rm.jmx.rrd.name=rm_statistics.rrd

# RRD data base step in seconds
pa.rm.jmx.rrd.step=4

# path to the Amazon EC2 account credentials properties file,
# mandatory when using the EC2 Infrastructure
pa.rm.ec2.properties=config/rm/deployment/ec2.properties

#-------------------------------------------------------
#---------------   AUTHENTICATION PROPERTIES   ------------------
#-------------------------------------------------------

# path to the Jaas configuration file which defines what modules are available for internal authentication
pa.rm.auth.jaas.path=config/authentication/jaas.config

# path to the private key file which is used to encrypt credentials for authentication
pa.rm.auth.privkey.path=config/authentication/keys/priv.key

# path to the public key file which is used to encrypt credentials for authentication
pa.rm.auth.pubkey.path=config/authentication/keys/pub.key

# LDAP Authentication configuration file path, used to set LDAP configuration properties
# If this file path is relative, the path is evaluated from the resource manager dir (ie application's root dir)
# with the variable defined below : pa.rm.home.
# else, (if the path is absolute) it is directly interpreted
pa.rm.ldap.config.path=config/authentication/ldap.cfg

# Login file name for file authentication method
# If this file path is relative, the path is evaluated from the resource manager dir (ie application's root dir)
# with the variable defined below : pa.rm.home.
# else, the path is absolute, so the path is directly interpreted
pa.rm.defaultloginfilename=config/authentication/login.cfg

# Group file name for file authentication method
# If this file path is relative, the path is evaluated from the resource manager dir (ie application's root dir)
# with the variable defined below : pa.rm.home.
# else, the path is absolute, so the path is directly interpreted
pa.rm.defaultgroupfilename=config/authentication/group.cfg

# Property that define the method that have to be used for logging users to the resource manager
# It can be one of the following values :
#	 - "RMFileLoginMethod" to use file login and group management
#	 - "RMLDAPLoginMethod" to use LDAP login management
pa.rm.authentication.loginMethod=RMFileLoginMethod

# Path to the rm credentials file for authentication
pa.rm.credentials=config/authentication/rm.cred

#-------------------------------------------------------
#--------------   HIBERNATE PROPERTIES   ---------------
#-------------------------------------------------------
# Hibernate configuration file (relative to home directory)
pa.rm.db.hibernate.configuration=config/rm/database.cfg.xml

# Drop database before creating a new one
# If this value is true, the database will be dropped and then re-created
# If this value is false, database will be updated from the existing one.
pa.rm.db.hibernate.dropdb=false

# Drop only node sources from the data base
pa.rm.db.hibernate.dropdb.nodesources=false

#-------------------------------------------------------
#--------------   TOPOLOGY  PROPERTIES   ---------------
#-------------------------------------------------------
pa.rm.topology.enabled=true

# Pings hosts using standard InetAddress.isReachable() method.
pa.rm.topology.pinger.class=org.ow2.proactive.resourcemanager.frontend.topology.pinging.HostsPinger
# Pings ProActive nodes using Node.getNumberOfActiveObjects().
#pa.rm.topology.pinger.class=org.ow2.proactive.resourcemanager.frontend.topology.pinging.NodesPinger

# Location of selection scripts logs (comment to disable job logging to separate files). Can be an absolute path.
pa.rm.logs.selection.location=logs/jobs/

16.3. Network Properties

Configuration files related to network properties for the Scheduler and nodes are located respectively in:

  • PROACTIVE_HOME/config/network/server.ini

  • PROACTIVE_HOME/config/network/node.ini

If you are using ProActive agents, then the configuration file location differs depending of the Operating System:

  • On Unix OS, the default location is /opt/proactive-node/config/network/node.ini

  • On Windows OS, the default one is C:\Program Files (x86)\ProActiveAgent\schedworker\config\network\node.ini.

16.3.1. Common Network Properties

The protocol to use by the Server and the nodes can be configured by setting a value to the property proactive.communication.protocol. It represents the protocol used to export objects on remote JVMs. At this stage, several protocols are supported: PNP (pnp), PNP over SSL (pnps), ProActive Message Routing (pamr).

The Scheduler is only able to bind to one and only one address. Usually, this limitation is not seen by the user and no special configuration is required. The Scheduler tries to use the most suitable network address available. But sometimes, the Scheduler fails to elect the right IP address or the user wants to use a given IP address. In such case, you can specify the IP address to use by using theses properties: proactive.hostname, proactive.net.interface, proactive.net.netmask, proactive.net.nolocal, proactive.net.noprivate.

IPv6 can be enabled by setting the proactive.net.disableIPv6 property to false . By default, the Scheduler does not use IPv6 addresses.

If none of the proactive.hostname, proactive.net.interface, proactive.net.netmask, proactive.net.nolocal , proactive.net.noprivate properties is defined, then the following algorithm is used to elect an IP address:

  • If a public IP address is available, then use it. If several ones are available, one is randomly chosen.

  • If a private IP address is available, then use it. If several ones are available, one is randomly chosen.

  • If a loopback IP address is available, then use it. If several ones are available, one is randomly chosen.

  • If no IP address is available at all, then the runtime exits with an error message.

  • If proactive.hostname is set, then the value returned by InetAddress.getByName(proactive.hostname) is elected. If no IP address is found, then the runtime exits with an error message.

If proactive.hostname is not set, and at least one of the proactive.net.interface , proactive.net.netmask , proactive.net.nolocal , proactive.net.noprivate is set, then one of the addresses matching all the requirements is elected. Requirements are:

  • If proactive.net.interface is set, then the IP address must be bound to the given network interface.

  • If proactive.net.netmask is set, then the IP address must match the given netmask.

  • If proactive.net.nolocal is set, then the IP address must not be a loopback address.

  • If proactive.net.noprivate is set, then the IP address must not be a private address.

  • proactive.useIPaddress: If set to true, IP addresses will be used instead of machines names. This property is particularly useful to deal with sites that do not host a DNS.

  • proactive.hostname: When this property is set, the host name on which the JVM is started is given by the value of the property. This property is particularly useful to deal with machines with two network interfaces.

16.3.2. PNP Protocol Properties

PNP allows the following options:

  • proactive.pnp.port: The TCP port to bind to. If not set PNP uses a random free port. If the specified TCP port is already used, PNP will not start and an error message is displayed.

  • proactive.pnp.default_heartbeat: PNP uses heartbeat messages to monitor the TCP socket and discover network failures. This value determines how long PNP will wait before the connection is considered broken. Heartbeat messages are usually sent every default_heartbeat/2 ms. This value is a trade-off between fast error discovery and network overhead. The default value is 30000 ms. Setting this value to 0 disables the heartbeat mechanism and client will not be advertised of network failure before the TCP timeout (which can be really long).

  • proactive.pnp.idle_timeout: PNP channels are closed when unused to free system resources. Establishing a TCP connection is costly (at least 3 RTT) so PNP connections are not closed immediately but after a grace time. By default the grace time is 60 000 ms. Setting this value to 0 disables the autoclosing mechanism, connections are kept open forever.

16.3.3. PNP over SSL Properties

PNPS support the same options as PNP (in its own option name space) plus some SSL specific options:

  • proactive.pnps.port: same as proactive.pnp.port

  • proactive.pnps.default_heartbeat: same as proactive.pnp.default_heartbeat

  • proactive.pnps.idle_timeout: same as proactive.pnp.idle_timeout

  • proactive.pnps.authenticate: By default, PNPS only ciphers the communication but does not authenticate nor the client nor the server. Setting this option to true enable client and server authentication. If set to true the option proactive.pnps.keystore must also be set.

  • proactive.pnps.keystore: Specify the keystore (containing the SSL private key) to use. The keystore must be of type PKCS12. If not set a private key is dynamically generated for this execution. Below is an example for creating a keystore by using the keytool binary that is shipped with Java:

    $JAVA_HOME/bin/keytool -genkey -keyalg RSA -keystore keystore.jks \
           -validity 365 -keyalg RSA -keysize 2048 -storetype pkcs12
  • proactive.pnps.keystore.password: the password associated to the keystore used by PNPS.

When using the authentication and ciphering mode (or to speed up the initialization in ciphering only mode), the option proactive.pnps.keystore must be set and a keystore embedding the private SSL key must be generated. This keystore must be accessible to ProActive services but kept secret to others. The same applies to the configuration file that contains the keystore password defined with property proactive.pnps.keystore.password.

16.3.4. PAMR Protocol Properties

PAMR options are listed below:

  • proactive.pamr.router.address: The address of the router to use. Must be set if message routing is enabled. It can be FQDN or an IP address.

  • proactive.pamr.router.port: The port of the router to use. Must be set if message routing is enabled.

  • proactive.pamr.socketfactory: The Socket Factory to use by the message routing protocol

  • proactive.pamr.connect_timeout: Sockets used by the PAMR remote object factory connect to the remote server with a specified timeout value. A timeout of zero is interpreted as an infinite timeout. The connection will then block until established or an error occurs.

  • proactive.pamr.agent.id: This property can be set to obtain a given (and fixed) agent ID. This id must be declared in the router configuration and must be between 0 and 4096.

  • proactive.pamr.agent.magic_cookie: The Magic cookie to submit to the router. If proactive.pamr.agent.id is set, then this property must also be set to be able to use a reserved agent ID.

PAMR over SSH protocol properties

To enable PAMR over SSH, ProActive nodes hosts should be able to SSH to router’s host without password, using SSH keys.

  • proactive.pamr.socketfactory: The underlying Socket factory, should be ssh to enable PAMR over SSH

  • proactive.pamrssh.port: SSH port to use when connecting to router’s host

  • proactive.pamrssh.username: username to use when connecting to router’s host

  • proactive.pamrssh.key_directory: directory when SSH keys can be found to access router’s host. For instance /home/login/.ssh

16.3.5. Enabling Several Communication Protocols

The next options are available to control multiprocol:

  • proactive.communication.additional_protocols: The set of protocol to use separated by commas.

  • proactive.communication.benchmark.parameter: This property is used pass parameters to the benchmark. This could be a duration time, a size, …​ This property is expressed as a String.

  • proactive.communication.protocols.order: A fixed order could be specified if protocol’s performance is known in advance and won’t change. This property explain a preferred order for a subset of protocols declared in the property proactive.communication.additional_protocols. If one of the specified protocol isn’t exposed, it is ignored. If there are protocols that are not declared in this property but which are exposed, they are used in the order choose by the benchmark mechanism.

    Example :
    Exposed protocols : http,pnp,rmi
    Benchmark Order : rmi > pnp > http
    Order : pnp
    This will give the order of use : pnp > rmi > http
  • proactive.communication.protocols.order: A fixed order could be specified if protocol’s performance is known in advance and won’t change. This automatically disabled RemoteObject’s Benchmark.

16.4. REST API & Web Properties

REST API Properties are read when ProActive Scheduler is started therefore you need to restart it to apply changes.

The default configuration file is SCHEDULER_HOME/web/settings.ini.

### Web applications configuration ###

# web applications in dist/war are deployed by default
web.deploy=true

# port to use to deploy web applications
web.port=8080

# HTTPS/SSL configuration
web.https=false
# path to keystore, can be absolute or relative to SCHEDULER_HOME
web.https.keystore=config/web/keystore
web.https.keystore.password=activeeon

### REST API configuration ###

# will be set by JettyStarter, you will need to set it if you run REST server in standalone mode
#scheduler.url=rmi://localhost:1099

# scheduler user that is used as cache
scheduler.cache.login=watcher
scheduler.cache.password=w_pwd
scheduler.cache.credential=

#cache refresh rate in ms
rm.cache.refreshrate=3500

 #will be set by JettyStarter, you will need to set it if you run REST server in standalone mode
#rm.url=rmi://localhost:1099

# rm user that is used as cache
rm.cache.login=watcher
rm.cache.password=w_pwd
rm.cache.credential=

scheduler.logforwardingservice.provider=org.ow2.proactive.scheduler.common.util.logforwarder.providers.SocketBasedForwardingProvider

#### noVNC integration ####
# enable or disable websocket proxy (true or false)
novnc.enabled=false
# port used by websocket proxy (integer)
novnc.port=5900
# security configuration SSL (ON or OFF or REQUIRED)
novnc.secured=ON
# security keystore for SSL
# to create one for development: keytool -genkey -keyalg RSA -alias selfsigned -keystore keystore.jks -storepass password -validity 360 -keysize 2048
novnc.keystore=keystore.jks
# security keystore password
novnc.password=password
# security keystore key password
novnc.keypassword=password

16.5. Workflow Catalog Properties

Workflow Catalog Properties are read from its WAR file.

The default configuration file is located at WEB-INF/classes/application.properties. It can be extended with additional Spring properties.

# Configure logging level
logging.level.org.hibernate.SQL=off
logging.level.org.ow2.proactive.workflow_catalog.rest.controller=info
logging.level.org.ow2.proactive.workflow_catalog.rest.query=info
logging.level.org.springframework.web=info

# Buckets that must be created on first run
pa.workflow_catalog.default.buckets=Templates

# Embedded server configuration
server.compression.enabled=true
server.contextPath=/

# DataSource settings: set here your own configurations for the database connection.
#spring.datasource.driverClassName=org.hsqldb.jdbc.JDBCDriver
#spring.datasource.url=jdbc:hsqldb:file:/tmp/proactive/workflow-catalog;create=true;hsqldb.tx=mvcc;hsqldb.applog=1;hsqldb.sqllog=0;hsqldb.write_delay=false
#spring.datasource.username=root
#spring.datasource.password=

# Hibernate ddl auto (create, create-drop, update)
spring.jpa.hibernate.ddl-auto=update
# Show or not log for each sql query
spring.jpa.show-sql=false

# Disable Spring banner
spring.main.banner_mode=off

16.6. Node Sources

The ProActive Resource Manager supports ProActive Nodes aggregation from heterogeneous environments. As a node is just a process running somewhere, the process of communication to such nodes is unified. The only part which has to be defined is the procedure of nodes deployment which could be quite different depending on infrastructures and their limitations. After installation of the server and node parts it is possible to configure an automatic nodes deployment. Basically, you can say to the Resource Manager how to launch nodes and when.

In the ProActive Resource Manager, there is always a default node source consisted of DefaultInfrastructureManager and Static policy. It is not able to deploy nodes anywhere but makes it possible to add existing nodes to the RM.

16.6.1. Node Source Infrastructure

An Infrastructure Manager is responsible for deploying nodes. In most of the cases it knows how to launch a node on a remote machine. For instance the SSH infrastructure manager connects via SSH to a remote machine and launches nodes.

Default Infrastructure Manager

Default infrastructure manager is designed to be used with ProActive agent. It cannot perform an automatic deployment but any users (including an agent) can add already existing nodes into it. In order to create a node source with this infrastructure, run the following command:

$ PROACTIVE_HOME/bin/proactive-client --createns defaultns -infrastructure org.ow2.proactive.resourcemanager.nodesource.infrastructure.DefaultInfrastructureManager rmURL

The only parameter to provide is the following one:

  • rmURL - the URL of the Resource Manager.

Local Infrastructure Manager

Local Infrastructure Manager can be used to start nodes locally, i.e, on the host running the Resource Manager. In order to create a node source with this infrastructure, run the following command:

$ PROACTIVE_HOME/bin/proactive-client --createns localns -infrastructure org.ow2.proactive.resourcemanager.nodesource.infrastructure.LocalInfrastructure rmURL credentialsPath numberOfNodes timeout javaProperties
  • rmURL - URL of the Resource Manager server (can leave it empty to use default value, i.e, the URL of the Resource Manager you connect to)

  • credentialsPath - The absolute path of the credentials file used to set the provider of the nodes.

  • numberOfNodes - The number of nodes to deploy.

  • timeout - The length in ms after which one a node is not expected anymore.

  • javaProperties - The java properties to setup the ProActive environment for the nodes

SSH Infrastructure

This infrastructure allows to deploy nodes over SSH. This infrastructure needs 10 arguments, described hereafter:

  • RM URL - The Resource Manager’s URL that will be used by the deployed nodes to register by themselves.

  • SSH Options - Options you can pass to the SSHClient executable ( -l inria to specify the user for instance )

  • Java Path - Path to the java executable on the remote hosts.

  • Scheduling Path - Path to the Scheduler installation directory on the remote hosts.

  • Node Time Out - A duration after which one the remote nodes are considered to be lost.

  • Attempt - The number of time the Resource Manager tries to acquire a node for which one the deployment fails before discarding it forever.

  • Target OS - One of 'LINUX', 'CYGWIN' or 'WINDOWS' depending on the machines' ( in Hosts List file ) operating system.

  • Java Options - Java options appended to the command used to start the node on the remote host.

  • RM Credentials Path - The absolute path of the 'rm.cred' file to make the deployed nodes able to register to the Resource Manager ( config/authentication/rm.cred ).

  • Hosts List - Path to a file containing the hosts on which resources should be acquired. This file should contain one host per line, described as a host name or a public IP address, optionally followed by a positive integer describing the number of runtimes to start on the related host (default to 1 if not specified). Example:

    rm.example.com
    test.example.net 5
    192.168.9.10 2
CLI Infrastructure

This generic infrastructure allows to deploy nodes using deployment script written in arbitrary language. The infrastructure just launches this script and waits until the ProActive node is registered in the Resource Manager. Command line infrastructure could be used when you prefer to describe the deployment process using shell scripts instead of Java. Script examples can be found in PROACTIVE_HOME/samples/scripts/deployment. The deployment script has 4 parameters: HOST_NAME, NODE_NAME, NODE_SOURCE_NAME, RM_URL. The removal script has 2 parameters: HOST_NAME and NODE_NAME.

This infrastructure needs 7 arguments, described hereafter:

  • RM URL - The Resource Manager’s URL that will be used by the deployed nodes to register by themselves.

  • Interpreter - Path to the script interpreter (bash by default).

  • Deployment Script - A script that launches a ProActive node and register it to the RM.

  • Removal Script - A script that removes the node from the Resource Manager.

  • Hosts List - Path to a file containing the hosts on which resources should be acquired. This file should contain one host per line, described as a host name or a public IP address, optionally followed by a positive integer describing the number of runtimes to start on the related host (default to 1 if not specified). Example:

    rm.example.com
    test.example.net 5
    192.168.9.10 2
  • Node Time Out - The length in ms after which one a node is not expected anymore.

  • Max Deployment Failure - the number of times the resource manager tries to relaunch the deployment script in case of failure.

EC2 Infrastructure

The Elastic Compute Cloud, aka EC2, is an Amazon Web Service, that allows its users to use machines (instances) on demand on the cloud. An EC2 instance is a Xen virtual machine, running on different kinds of hardware, at different prices, but always paid by the hour, allowing lots of flexibility. Being virtual machines, instances can be launched using custom operating system images, called AMI (Amazon Machine Image). For the Resource Manager to use EC2 instances as computing nodes, a specific EC2 Infrastructure as well as AMI creation utilities are provided.

To use the EC2 Infrastructure in the Resource Manager, proper Amazon credentials are needed. This section describes briefly how to obtain them, and how to use them to configure your environment.

  1. First, you need to create an AWS account at http://aws.amazon.com/.

  2. With your new AWS account, sign up for EC2 at http://aws.amazon.com/ec2/.

  3. Now, you need to obtain the credentials. On the AWS website, point your browser to the Your Web Services Account button, a drop down list displays. Click View Access Key Identifiers.

  4. Use this information to fill in the properties aws_key (Access Key), aws_secret_key (Secret Key) in the Create Node Source panel located in the Resource Manager. Those two parameters should never change, except if you need for some reason to handle multiple EC2 accounts. Other properties in the Create Node Source are:

    • rmHostname: The hostname or the public IP address of the Resource Manager. This address needs to be accessible from the AWS cloud.

    • connectorIaasURL: Connector-iaas is a service embedded in the Scheduler used to interact with IaaS like AWS. By default it runs on the following URL rmHostname/connector-iaas.

    • image: Defines which AMI will be used to create an AWS instances. The value to provide is the AWS region together with the unique AMI Id, for example: eu-west-1/ami-bff32ccc.

    • numberOfInstances: Total number of AWS instances to create for this infrastructure.

    • numberOfNodesPerInstance: Total number of Proactive Nodes to deploy in each created AWS instance.

      If all the nodes of an AWS instance are removed, the instance will be terminated. For more information on the terminated state in AWS please see AWS Terminating Instances.
    • downloadCommand: The command to download the Proactive node.jar. This command is executed in all the newly created AWS instances. The full URL path of the node.jar to download, needs to be accessible from the AWS cloud. Example based on AWS image with windows operating system:

      powershell -command "& { (New-Object Net.WebClient).DownloadFile('try.activeeon.com/rest/node.jar', 'node.jar') }"
    • additionalProperties: Additional Java command properties to be added when starting each ProActive node JVM in the AWS instances (e.g. \"-Dpropertyname=propertyvalue\").

    • minRam: The minimum required amount of RAM expressed in Mega Bytes for each AWS instance that needs to be created.

    • minCores: The minimum required amount of virtual cores for each AWS instance that needs to be created.

    If the combination between RAM and CORES does not match any existing AWS instance type, then the closest to the specified parameters will be selected.

Using this configuration, you can start a Resource Manager and a Scheduler using the /bin/proactive-server script. An EC2 NodeSource can now be added using the Create Node Source panel in the Resource Manager or the command line interface:

$ PROACTIVE_HOME/bin/proactive-client --createns ec2 -infrastructure org.ow2.proactive.resourcemanager.nodesource.infrastructure.AWSEC2Infrastructure aws_key aws_secret_key rmDomain connectorIaasURL image numberOfInstances numberOfNodesPerInstance downloadCommand additionalProperties minRam minCores

As AWS is a paying service, when the Scheduler is stopped normally (without removing the created infrastructure), all the created AWS instances will be terminated. And when the Scheduler is restarted, these instances will be re-configured as per previous settings.

If the Scheduler is forced killed, the created AWS instances will not be terminated. And, when the Scheduler is restarted, the infrastructure will be re-configured as per previous settings. If the instances were delete at the AWS side, they will be re-created and re-configured.
OpenStack Infrastructure

To use the OpenStack instances as computing nodes, a specific OpenStack Infrastructure can be created using the Resource Manager. This section describes briefly how to make it.

  1. First, you need to have an admin account on your OpenStack server. For more information see OpenStack users and tenants.

  2. Use the login, tenant and password information to fill in the properties openstack_username (tenant:login), openstack_password in the Create Node Source panel located in the Resource Manager. Those two parameters should never change, except if you need for some reason to handle multiple OpenStack accounts. Other properties in the Create Node Source are:

    • endpoint: The hostname or the IP address of the OpenStack server. This address needs to be accessible from the Resource Manager.

    • rmHostname: The hostname or the public IP address of the Resource Manager. This address needs to be accessible from the OpenStack server.

    • connectorIaasURL: Connector-iaas is a service embedded in the Scheduler used to interact with IaaS like OpenStack. By default it runs on the following URL rmHostname/connector-iaas.

    • image: Defines which image will be used to create the Openstack instance. The value to provide is the Openstack region together with the unique image Id, for example: RegionOne/74oi56bff32ccc.

    • flavor: Defines the size of the instance. For more information see OpenStack flavors.

    • publicKeyName: Defines the name of the public key to use for a remote connection when the instance is created.

      In order to use publicKeyName, the key pair needs to be created and imported first on the openstack server. For more information see OpenStack key pair management.
    • numberOfInstances: Total number of OpenStack instances to create for this infrastructure.

    • numberOfNodesPerInstance: Total number of Proactive Nodes to deploy in each Openstack created instance.

    If all the nodes of an Openstack instance are removed, the instance will be terminated.

    +

    • downloadCommand: The command to download the Proactive node.jar. This command is executed in all the newly created OpenStack instances. The full URL path of the node.jar to download needs to be accessible from the OpenStack cloud.

    • additionalProperties: Additional Java command properties to be added when starting each ProActive node JVM in the OpenStack instances (e.g. \"-Dpropertyname=propertyvalue\").

Using this configuration, you can start a Resource Manager and a Scheduler using the /bin/proactive-server script. An OpenStack NodeSource can now be added using the Create Node Source panel in the Resource Manager or the command line interface:

$ PROACTIVE_HOME/bin/proactive-client --createns openstack -infrastructure org.ow2.proactive.resourcemanager.nodesource.infrastructure.OpenstackInfrastructure username password endpoint rmHostname connectorIaasURL image flavor publicKeyName numberOfInstances numberOfNodesPerInstance downloadCommand additionalProperties
When the Scheduler is stopped (without removing the created infrastructure), the OpenStack instances will not be terminated. And when the Scheduler is restarted, the infrastrucutre will be re-configured as per previous settings. If the instances were delete at the Openstack server side, they will be re-created and re-configured.
VmWare Infrastructure

To use the VmWare instances as computing nodes, a specific VmWare Infrastructure can be created using the Resource Manager. This section describes briefly how to make it.

  1. First, you need to have an admin account on your VmWare server.For more information see VmWare users.

  2. Use the login and password information to fill in the properties vmware_username, vmware_password in the Create Node Source panel located in the Resource Manager. Those two parameters should never change, except if you need for some reason to handle multiple VmWare accounts. Other properties in the Create Node Source are:

    • endpoint: The hostname or the IP address of the VmWare server. This address needs to be accessible from the Resource Manager.

    • rmHostname: The hostname or the public IP address of the Resource Manager. This address needs to be accessible from the VmWare server.

    • connectorIaasURL: Connector-iaas is a service embedded in the Scheduler used to interact with IaaS like VmWare. By default it runs on the following URL rmHostname/connector-iaas.

    • image: Defines which image will be used to create the VmWare instance. The value to provide is the VmWare folder together with the unique image Id, for example: ActiveEon/ubuntu.

    • minRam: The minimum required amount of RAM expressed in Mega Bytes for each VmWare instance that needs to be created.

    • minCores: The minimum required amount of virtual cores for each VmWare instance that needs to be created.

      If the combination between RAM and CORES does not match any existing VmWare instance type, then the closest to the specified parameters will be selected.
    • vmUsername: Defines the username to log in the instance when it is created.

    • vmPassword: Defines the password to log in the instance when it is created.

      The username and password are related to the image.
    • numberOfInstances: Total number of VmWare instances to create for this infrastructure.

    • numberOfNodesPerInstance: Total number of Proactive Nodes to deploy in each VmWare created instance.

    If all the nodes of an VmWare instance are removed, the instance will be terminated.

    +

    • downloadCommand: The command to download the Proactive node.jar. This command is executed in all the newly created VmWare instances. The full URL path of the node.jar to download, needs to be accessible from the VmWare cloud.

    • additionalProperties: Additional Java command properties to be added when starting each ProActive node JVM in the VmWare instances (e.g. \"-Dpropertyname=propertyvalue\").

Using this configuration, you can start a Resource Manager and a Scheduler using the /bin/proactive-server script. An VmWare NodeSource can now be added using the Create Node Source panel in the Resource Manager or the command line interface:

$ PROACTIVE_HOME/bin/proactive-client --createns vmware -infrastructure org.ow2.proactive.resourcemanager.nodesource.infrastructure.VmwareInfrastructure username password endpoint rmHostname connectorIaasURL image ram cores vmusername vmpassword numberOfInstances numberOfNodesPerInstance downloadCommand additionalProperties
When the Scheduler is stopped (without removing the created infrastructure), the VmWare instances will not be terminated. And when the Scheduler is restarted, the infrastrucutre will be re-configured as per previous settings. If the instances were delete at the VmWare server side, they will be re-created and re-configured.
Load Sharing Facility (LSF) infrastructure

This infrastructure knows how to acquire nodes from LSF by submitting a corresponding job. It will be submitted through SSH from the RM to the LSF server.

$ PROACTIVE_HOME/bin/proactive-client --createns lsf -infrastructure org.ow2.proactive.resourcemanager.nodesource.infrastructure.LSFInfrastructure rmURL javaPath SSHOptions schedulingPath javaOptions maxNodes nodeTimeout LSFServer RMCredentialsPath bsubOptions

where:

  • RMURL - URL of the Resource Manager from the LSF nodes point of view - this is the URL the nodes will try to lookup when attempting to register to the RM after their creation.

  • javaPath - path to the java executable on the remote hosts (ie the LSF slaves).

  • SSH Options - Options you can pass to the SSHClient executable ( -l inria to specify the user for instance )

  • schedulingPath - path to the Scheduling/RM installation directory on the remote hosts.

  • javaOptions - Java options appended to the command used to start the node on the remote host.

  • maxNodes - maximum number of nodes this infrastructure can simultaneously hold from the LSF server. That is useful considering that LSF does not provide a mechanism to evaluate the number of currently available or idle cores on the cluster. This can result to asking more resources than physically available, and waiting for the resources to come up for a very long time as the request would be queued until satisfiable.

  • Node Time Out - The length in ms after which one a node is not expected anymore.

  • Server Name - URL of the LSF server, which is responsible for acquiring LSF nodes. This server will be contacted by the Resource Manager through an SSH connection.

  • RM Credentials Path - Encrypted credentials file, as created by the create-cred[.bat] utility. These credentials will be used by the nodes to authenticate on the Resource Manager.

  • Submit Job Opt - Options for the bsub command client when acquiring nodes on the LSF master. Default value should be enough in most cases, if not, refer to the documentation of the LSF cluster.

Portable Batch System (PBS) infrastructure

This infrastructure knows how to acquire nodes from PBS (i.e. Torque) by submitting a corresponding job. It will be submitted through SSH from the RM to the PBS server.

$ PROACTIVE_HOME/bin/proactive-client --createns pbs -infrastructure org.ow2.proactive.resourcemanager.nodesource.infrastructure.PBSInfrastructure rmURL javaPath SSHOptions schedulingPath javaOptions maxNodes nodeTimeout PBSServer RMCredentialsPath qsubOptions

where:

  • RMURL - URL of the Resource Manager from the PBS nodes point of view - this is the URL the nodes will try to lookup when attempting to register to the RM after their creation.

  • javaPath - path to the java executable on the remote hosts (ie the PBS slaves).

  • SSH Options - Options you can pass to the SSHClient executable ( -l inria to specify the user for instance )

  • schedulingPath - path to the Scheduling/RM installation directory on the remote hosts.

  • javaOptions - Java options appended to the command used to start the node on the remote host.

  • maxNodes - maximum number of nodes this infrastructure can simultaneously hold from the PBS server. That is useful considering that PBS does not provide a mechanism to evaluate the number of currently available or idle cores on the cluster. This can result to asking more resources than physically available, and waiting for the resources to come up for a very long time as the request would be queued until satisfiable.

  • Node Time Out - The length in ms after which one a node is not expected anymore.

  • Server Name - URL of the PBS server, which is responsible for acquiring PBS nodes. This server will be contacted by the Resource Manager through an SSH connection.

  • RM Credentials Path - Encrypted credentials file, as created by the create-cred[.bat] utility. These credentials will be used by the nodes to authenticate on the Resource Manager.

  • Submit Job Opt - Options for the qsub command client when acquiring nodes on the PBS master. Default value should be enough in most cases, if not, refer to the documentation of the PBS cluster.

Generic Batch Job infrastructure

Generic Batch Job infrastructure provides users with the capability to add the support of new batch job scheduler by providing a class extending org.ow2.proactive.resourcemanager.nodesource.infrastructure.BatchJobInfrastructure. Once you have written that implementation, you can create a node source which makes usage of this infrastructure by running the following command:

$ PROACTIVE_HOME/bin/proactive-client --createns pbs -infrastructure org.ow2.proactive.resourcemanager.nodesource.infrastructure.GenericBatchJobInfrastructure rmURL javaPath SSHOptions schedulingPath javaOptions maxNodes nodeTimeout BatchJobServer RMCredentialsPath subOptions implementationName implementationPath

where:

  • RMURL - URL of the Resource Manager from the batch job scheduler nodes point of view - this is the URL the nodes will try to lookup when attempting to register to the RM after their creation.

  • javaPath - path to the java executable on the remote hosts (ie the slaves of the batch job scheduler).

  • SSH Options - Options you can pass to the SSHClient executable ( -l inria to specify the user for instance )

  • schedulingPath - path to the Scheduling/RM installation directory on the remote hosts.

  • javaOptions - Java options appended to the command used to start the node on the remote host.

  • maxNodes - maximum number of nodes this infrastructure can simultaneously hold from the batch job scheduler server.

  • Node Time Out - The length in ms after which one a node is not expected anymore.

  • Server Name - URL of the batch job scheduler server, which is responsible for acquiring nodes. This server will be contacted by the Resource Manager through an SSH connection.

  • RM Credentials Path - Encrypted credentials file, as created by the create-cred[.bat] utility. These credentials will be used by the nodes to authenticate on the Resource Manager.

  • Submit Job Opt - Options for the submit command client when acquiring nodes on the batch job scheduler master.

  • implementationName - Fully qualified name of the implementation of org.ow2.proactive.resourcemanager.nodesource.infrastructure.BatchJobInfrastructure provided by the end user.

  • implementationPath - The absolute path of the implementation of org.ow2.proactive.resourcemanager.nodesource.infrastructure.BatchJobInfrastructure.

16.6.2. Node Source Policy

Node source policy is a set of rules and conditions which describes when and how many nodes have to be acquired or released. Policies use node source API to manage the node acquisition.

Node sources were designed in a way that:

  • All logic related to node acquisition is encapsulated in the infrastructure manager.

  • Conditions and rules of node acquisition is described in the node source policy.

  • Permissions to the node source. Each policy has two parameters:

    • nodeUsers - utilization permission defined who can get nodes for computations from this node source. It has to take one of the following values:

      • ME - Only the node source creator

      • users=user1,user2;groups=group1,group2;tokens=t1,t2 - Only specific users, groups or tokens. I.e. users=user1 - node access is limited to user1; users=user1;groups=group1 - node access is limited to user1 and all users from group group1; users=user1;tokens=t1 - node access is limited to user1 or anyone who specified token t1. If node access is protected by a token, node will not be found by the resource manager (getNodes request) unless the corresponding token is specified.

      • ALL - Everybody

    • nodeProviders - Provider permission defines who can add nodes to this node source. It should take one of the following values:

      • ME - Only the node source creator

      • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

      • ALL - Everybody

  • The user created the node source is the administrator of this node source. It can add and removed nodes to it, remove the node source itself, but cannot use nodes if usage policy is set to PROVIDER or PROVIDER_GROUPS (unless it’s granted AllPermissions).

  • New infrastructure manager or node source policy can be dynamically plugged into the Resource Manager. In order to do that, it is just required to add new implemented classes in the class path and update corresponding list in the configuration file (PROACTIVE_HOME/config/rm/nodesource).

Static Policy

Static node source policy starts node acquisition when nodes are added to the node source and never removes them. Nevertheless, nodes can be removed by user request. To use this policy, you have to precise the following parameters:

  • nodeUsers - utilization permission defined who can get nodes for computations from this node source. It has to take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • nodeProviders - Provider permission defines who can add nodes to this node source. It should take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

Time Slot Policy

Time slot policy is aimed to acquire nodes for particular time with an ability to do it periodically. To use this policy, you have to precise the following parameters:

  • nodeUsers - utilization permission defined who can get nodes for computations from this node source. It has to take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • nodeProviders - Provider permission defines who can add nodes to this node source. It should take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • acquireTime - Absolute acquire date (e.g. "6/3/10 1:18:45 PM CEST").

  • releaseTime - Absolute releasing date (e.g. "6/3/10 2:18:45 PM CEST").

  • period - period time in millisecond (default is 86400000).

  • preemptive - Preemptive parameter indicates the way of releasing nodes. If it is true, nodes will be released without waiting the end of jobs running on (default is false).

Cron Policy

Cron policy is aimed to acquire and remove nodes at specific time defined in the cron syntax.To use this policy, you have to precise the following parameters:

  • nodeUsers - utilization permission defined who can get nodes for computations from this node source. It has to take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • nodeProviders - Provider permission defines who can add nodes to this node source. It should take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • nodeAcquision - The time policy will trigger the deployment of all nodes (e.g. "0 12 \* \* \*" every day at 12.00).

  • nodeRemoval - The time policy will trigger the removal of all nodes (e.g. "0 13 \* \* \*" every day at 13.00).

  • preemptive - Preemptive parameter indicates the way of releasing nodes. If it is true, nodes will be released without waiting the end of jobs running on (default is false).

  • forceDeployment - If for the example above (the deployment starts every day at 12.00 and the removal starts at 13.00) you are creating the node source at 12.30 the next deployment will take place the next day. If you’d like to force the immediate deployment set this parameter to true.

Remove Nodes When Scheduler Is Idle

"Remove nodes when scheduler is idle" policy removes all nodes from the infrastructure when the scheduler is idle and acquires them when a new job is submitted. This policy may be useful if there is no need to keep nodes alive permanently. Nodes will be released after a specified "idle time". This policy will use a listener of the scheduler, that is why its URL, its user name and its password have to be specified. To use this policy, you have to precise the following parameters:

  • nodeUsers - utilization permission defined who can get nodes for computations from this node source. It has to take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • nodeProviders - Provider permission defines who can add nodes to this node source. It should take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • schedulerURL - URL of the Scheduler

  • schedulerCredentialsPath - Path to the credentials used for scheduler authentication.

  • idleTime - idle time in millisecond to wait before removing all nodes (default is 60000).

Scheduler Loading Policy

Scheduler loading policy acquires/releases nodes according to the scheduler loading factor. This policy allows to configure the number of resources which will be always enough for the scheduler. Nodes are acquired and released according to scheduler loading factor which is a number of tasks per node.

It is important to correctly configure maximum and minimum nodes that this policy will try to hold. Maximum number should not be greater than potential nodes number which is possible to deploy to underlying infrastructure. If there are more currently acquired nodes than necessary, policy will release them one by one after having waited for a "release period" delay. This smooth release procedure is implemented because deployment time is greater than the release one. Thus, this waiting time deters policy from spending all its time trying to deploy nodes.

To use this policy, you have to precise the following parameters:

  • nodeUsers - utilization permission defined who can get nodes for computations from this node source. It has to take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • nodeProviders - Provider permission defines who can add nodes to this node source. It should take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • schedulerURL - URL of the Scheduler

  • schedulerCredentialsPath - Path to the credentials used for scheduler authentication.

  • refreshTime - time between each calculation of the number of needed nodes.

  • minNodes - Minimum number of nodes to deploy

  • maxNodes - Maximum number of nodes to deploy

  • loadFactor - number of tasks per node. Actually, if this number is N, it does not means that there will be exactly N tasks executed on each node. This factor is just used to compute the total number of nodes. For instance, let us assume that this factor is 3 and that we schedule 100 tasks. In that case, we will have 34 (= upper bound of 100/3) started nodes. Once one task finished and the refresh time passed, one node will be removed since 99 divided by 3 is 33. When there will remain 96 tasks (assuming that no other tasks are scheduled meanwhile), an other node will be removed at the next calculation time, and so on and so forth…​

  • nodeDeploymentTimeout - The node deployment timeout.

Cron Load Based Policy

The Cron load based policy triggers new nodes acquisition when scheduler is overloaded (exactly like with "Scheduler loading" policy) only within a time slot defined using crontab syntax. All other time the nodes are removed from the resource manager. To use this policy, you have to precise the following parameters:

  • nodeUsers - utilization permission defined who can get nodes for computations from this node source. It has to take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • nodeProviders - Provider permission defines who can add nodes to this node source. It should take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • schedulerURL - URL of the Scheduler

  • schedulerCredentialsPath - Path to the credentials used for scheduler authentication.

  • refreshTime - time between each calculation of the number of needed nodes.

  • minNodes - Minimum number of nodes to deploy

  • maxNodes - Maximum number of nodes to deploy

  • loadFactor - number of tasks per node. Actually, if this number is N, it does not means that there will be exactly N tasks executed on each node. This factor is just used to compute the total number of nodes. For instance, let us assume that this factor is 3 and that we schedule 100 tasks. In that case, we will have 34 (= upper bound of 100/3) started nodes. Once one task finished and the refresh time passed, one node will be removed since 99 divided by 3 is 33. When there will remain 96 tasks (assuming that no other tasks are scheduled meanwhile), an other node will be removed at the next calculation time, and so on and so forth…​

  • nodeDeploymentTimeout - The node deployment timeout.

  • acquisionAllowed - The time when the policy starts to work as the "scheduler loading" policy (e.g. "0 12 \* \* \*" every day at 12.00).

  • acquisionForbidden - The time policy removes all the nodes from the resource manager (e.g. "0 13 \* \* \*" every day at 13.00).

  • preemptive - Preemptive parameter indicates the way of releasing nodes. If it is true, nodes will be released without waiting the end of jobs running on (default is false).

  • allowed - If true acquisition will be immediately allowed.

Cron Slot Load Based Policy

The "Cron slot load based" policy triggers new nodes acquisition when scheduler is overloaded (exactly like with "Scheduler loading" policy) only within a time slot defined using crontab syntax. The other time it holds all the available nodes. To use this policy, you have to precise the following parameters:

  • nodeUsers - utilization permission defined who can get nodes for computations from this node source. It has to take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • nodeProviders - Provider permission defines who can add nodes to this node source. It should take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • schedulerURL - URL of the Scheduler

  • schedulerCredentialsPath - Path to the credentials used for scheduler authentication.

  • refreshTime - time between each calculation of the number of needed nodes.

  • minNodes - Minimum number of nodes to deploy

  • maxNodes - Maximum number of nodes to deploy

  • loadFactor - number of tasks per node. Actually, if this number is N, it does not means that there will be exactly N tasks executed on each node. This factor is just used to compute the total number of nodes. For instance, let us assume that this factor is 3 and that we schedule 100 tasks. In that case, we will have 34 (= upper bound of 100/3) started nodes. Once one task finished and the refresh time passed, one node will be removed since 99 divided by 3 is 33. When there will remain 96 tasks (assuming that no other tasks are scheduled meanwhile), an other node will be removed at the next calculation time, and so on and so forth…​

  • nodeDeploymentTimeout - The node deployment timeout.

  • acquisionAllowed - The time when the policy starts to work as the "scheduler loading" policy (e.g. "0 12 \* \* \*" every day at 12.00).

  • acquisionForbidden - The time policy removes all the nodes from the resource manager (e.g. "0 13 \* \* \*" every day at 13.00).

  • preemptive - Preemptive parameter indicates the way of releasing nodes. If it is true, nodes will be released without waiting the end of jobs running on (default is false).

  • allowed - If true acquisition will be immediately allowed.

EC2 Policy

Allocates resources according to the Scheduler loading factor, releases resources considering that EC2 instances are paid by the hour. To use this policy, you have to precise the following parameters:

  • nodeUsers - utilization permission defined who can get nodes for computations from this node source. It has to take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • nodeProviders - Provider permission defines who can add nodes to this node source. It should take one of the following values:

    • ME - Only the node source creator

    • users=user1,user2;groups=group1,group2 - Only specific users or groups (for our example user1, user2, group1 and group2). It is possible to specify only groups or only users.

    • ALL - Everybody

  • schedulerURL - URL of the Scheduler

  • schedulerCredentialsPath - Path to the credentials used for scheduler authentication.

  • preemptive - Preemptive parameter indicates the way of releasing nodes. If it is true, nodes will be released without waiting the end of jobs running on (default is false).

  • refreshTime - time between each calculation of the number of needed nodes.

  • loadFactor - number of tasks per node. Actually, if this number is N, it does not means that there will be exactly N tasks executed on each node. This factor is just used to compute the total number of nodes. For instance, let us assume that this factor is 3 and that we schedule 100 tasks. In that case, we will have 34 (= upper bound of 100/3) started nodes. Once one task finished and the refresh time passed, one node will be removed since 99 divided by 3 is 33. When there will remain 96 tasks (assuming that no other tasks are scheduled meanwhile), an other node will be removed at the next calculation time, and so on and so forth…​

  • releaseDelay - Delay between each node release. This time is useful since the deploying time is important. Let us assume that a node has to be removed. If this releaseDelay did not exist (or if it was set to 0), this node would be removed instantaneously. Let us assume assume that right after this removal, another task is scheduled, requiring a new node. In that case, we would lose a lot of time removing the previous node and deploying another one whereas the task could have been scheduled on the same node. This releaseDelay therefore represents the time to wait before effectively removing a node.

16.7. Command Line

The ProActive client allows to interact with the Scheduler and Resource Manager. The client has an interactive mode started if you do not provided any command.

The client usage is also available using the -h parameter as shown below:

$ PROACTIVE_HOME/bin/proactive-client -h

Parameter

Description

-a,--addnodes <node-url>

Add the specified nodes by their URLs

-c,--credentials <cred-path>

Path to the credential file

-ca,--cacerts <store-path>

CA certificate store (JKS type) to verify peer against (SSL)

-cap,--cacertspass <store-password>

Password for CA certificate store

-cn,--createns <node-source>

Create new node source

-d,--removenodes <node-url>

Remove the specified node

-df,--downloadfile <space-name path-name local-file>

Download the specified file from the server and stores it locally.

-f,--force

Do not wait for buys nodes to be freed before removal

-freeze,--freezescheduler

Freeze the Scheduler (pause all non-running jobs)

-h,--help

Prints the usage of REST command-line client for Resource Manager and Scheduler

-in,--infrastructure <param1 param2 …​>

Specify an infrastructure for the node source

-jo,--joboutput <job-id>

Retrieve the output of specified job

-jp,--jobpriority <job-id priority>

Change the priority of the specified job

-jr,--jobresult <job-id>

Retrieve the result of specified job

-js,--jobstate <job-id>

Retrieve the current state of specified job

-k,--insecure

Allow connections to SSL sites without certs verification

-kill,--killscheduler

Kill the Scheduler

-kj,--killjob <job-id>

Kill the specfied job

-l,--login <login-name>

the login name to connect to REST server

-lc,--list-credentials

List third-party credential keys stored in the Scheduler

-li,--listinfra

List supported infrastructure types

-lj,--listjobs

Retrieve a list of all jobs managed by the Scheduler

-ll,--livelog <job-id>

Retrieve the live output of specified job

-ln,--listnodes

List nodes handled by the Resource Manager

-lns,--listns

List node-sources handled by the Resource Manager

-lon,--locknodes <node1-url node2-url …​>

Lock specified nodes

-lp,--listpolicy

List supported policy types

-lrm,--linkrm <rm-url>

Reconnect a Resource Manager to the Scheduler

-ni,--nodeinfo <node-url>

Retrieve info of specified node

-ns,--nodesource <node-source>

Specify node source name

-o,--output-file <arg>

Output the result of command execution to specified file

-p,--password <password>

the password to connect to REST server

-pause,--pausescheduler

Pause the Scheduler (pause all non-running jobs)

-pc,--put-credential <key value>

Store a third-party credential <key-value> pair in the Scheduler

-pj,--pausejob <job-id>

Pause the specified job (pause all non-running tasks)

-po,--policy <param1 param2 …​>

Specify a policy for the node source

-pr,--print

Print the session id

-pt,--preempttask <job-id task-id [delay]>

Stop the specified task and re-schedules it after the specified delay

-r,--removens <node-source>

Remove specified node source

-rc,--remove-credential <key>

Remove a third-party credential corresponding to the key from the Scheduler

-resume,--resumescheduler

Resume the Scheduler

-rh,--rmhelp

Prints the usage of REST command-line client for Resource Manager

-rj,--resumejob <job-id>

Resume the specified job (restart all 'paused' tasks)

-rmj,--removejob <job-id>

Remove the specified job

-rmstats,--rmstats

Retrieve current Resource Manager statistics

-rt,--restarttask <job-id task-name>

Restart the specified task after the specified delay

-s,--submit <job-descriptor [var1=value1 var2=value2 …​]>

Submit the specified job-description (XML) file

-sa,--submitarchive <job-archive [var1=value1 var2=value2 …​]>

Submit the specified job archive (JAR or ZIP) file

-sf,--script <script-path [param-1=value-1 param-2=value-2 …​]>

Evaluate the specified JavaScript file

-sh,--schedulerhelp

Prints the usage of REST command-line client for Scheduler

-si,--session-id <session-id>

the session id of this session

-sif,--session-id-file <session-file>

the session id file for this session

-sstats,--schedulerstats

Retrieve current Scheduler statistics

-start,--startscheduler

Start the Scheduler

-stop,--stopscheduler

Stop the Scheduler

-t,--topology

Retrieve node topology

-to,--taskoutput <job-id task-name>

Retrieve the output of specified task

-tr,--taskresult <job-id task-name>

Retrieve the result of specified task

-u,--url <server-url>

URL of REST server

-uf,--uploadfile <space-name file-path file-name local-file>

Upload a file to the specified location of the server

-ulon,--unlocknodes <node1-url node2-url …​>

Unlock specified nodes

-X,--debug

Print error messages (includes normal stack trace)

-z,--silent

Runs the command-line client in the silent mode

16.7.1. Command Line Examples

Deploy ProActive Nodes
In non-interactive mode
$ PROACTIVE_HOME/bin/proactive-client -cn 'moreLocalNodes' -infrastructure 'org.ow2.proactive.resourcemanager.nodesource.infrastructure.LocalInfrastructure' './config/authentication/rm.cred'  4 60000 '' -policy org.ow2.proactive.resourcemanager.nodesource.policy.StaticPolicy 'ALL' 'ALL'
In interactive mode
$ PROACTIVE_HOME/bin/proactive-client
> createns( 'moreLocalNodes', ['org.ow2.proactive.resourcemanager.nodesource.infrastructure.LocalInfrastructure', './config/authentication/rm.cred', 4, 60000, ''], ['org.ow2.proactive.resourcemanager.nodesource.policy.StaticPolicy', 'ALL', 'ALL'])

The Activeeon team © 2016 by Activeeon

This library is free software; you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation; version 3 of the License. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details. You should have received a copy of the GNU Affero General Public License along with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA If needed, contact us to obtain a release under GPL Version 2 or 3 or a different license than the AGPL.