-----Problem-----
An EAC task fails to start. The invoke log ($ENDECA_CONF/logs/invoke.0.log) on the central server shows an exception message along these lines:

SEVERE: 'Start shell' request encountered an error for application 'testapp'
com.endeca.esf.shared.EsfException: The task named 'wget_-http-127-0-0-1-15000-admin-op-update-warmupseconds-20-' in application 'testapp' already exists on machine 'mdexhost01.dev.domain.com'.


-----Cause-----
This error indicates that two or more tasks with the same name are being started at the same time, causing a collision in the EAC's task namespace.

For two or more tasks to run concurrently, their names (also called tokens or task IDs) must be unique across the entire application. It's possible to define tasks with the same name, but such tasks cannot run at the same time: the first task invoked with a given name will start normally, but the second will fail with the exception shown above.
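The rule above can be illustrated with a toy model. This is only a sketch: the `TaskNamespace` class and the second host name `mdexhost02.dev.domain.com` are hypothetical and are not part of the EAC API. It only mimics the behaviour described here, where a name can be reused once the earlier task has finished but not while it is still running.

```python
class TaskNameCollision(Exception):
    """Raised when a task name is already in use within the application."""


class TaskNamespace:
    """Toy stand-in for one application's task namespace (not the EAC API)."""

    def __init__(self, app):
        self.app = app
        self.running = {}  # task name -> host where it is currently running

    def start(self, name, host):
        # Names are unique per application, regardless of which host runs the task.
        if name in self.running:
            raise TaskNameCollision(
                f"The task named '{name}' in application '{self.app}' "
                f"already exists on machine '{self.running[name]}'."
            )
        self.running[name] = host

    def finish(self, name):
        # Once a task finishes, its name is free to be used again.
        self.running.pop(name, None)


if __name__ == "__main__":
    ns = TaskNamespace("testapp")
    name = "wget_-http-127-0-0-1-15000-admin-op-update-warmupseconds-20-"
    ns.start(name, "mdexhost01.dev.domain.com")
    try:
        ns.start(name, "mdexhost02.dev.domain.com")  # second start collides
    except TaskNameCollision as err:
        print(err)
    ns.finish(name)
    ns.start(name, "mdexhost02.dev.domain.com")  # fine once the first is done
```

In this model the second `start` call fails with a message naming the first host, which matches the shape of the exception in the invoke log.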

By default, the Deployment Template creates tasks with non-unique names, such as the wget/admin-update shell utility instance shown in the example above, which has the same name on each MDEX host server. Since these update jobs are run sequentially, there is no namespace collision: only one instance of the "wget_-http-127-0-0-1-15000-admin-op-update-warmupseconds-20-" task ever runs at a given time.

If users extend the Deployment Template (DT) with logic that runs these update jobs in parallel across all MDEX instances, however, the namespace-collision exception will occur for every task instance after the first: the shell utility will start on the first MDEX server, but will fail on the second server with an exception message referencing the existing task on the first server.


-----Solution-----
The best solution is to ensure that each task has a unique name, so that no namespace collision occurs. For the wget/admin-update job shown above, for instance, using the local hostname rather than the "127.0.0.1" IP address would result in a unique task name for each MDEX host server. This would yield tasks named (e.g.) "wget_-http-mdexhost01-dev-domain-com-15000-admin-op-update-warmupseconds-20-" rather than "wget_-http-127-0-0-1-15000-admin-op-update-warmupseconds-20-", which would allow the updates to be run concurrently on multiple hosts.
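To see why the hostname makes the difference, the naming pattern can be approximated as follows. This is a rough sketch under stated assumptions: the URL form is inferred from the task names quoted above, the functions `update_url` and `approx_task_name` are hypothetical, and the rule "replace each run of non-alphanumeric characters with a hyphen" is a guess that happens to reproduce those names, not the documented EAC algorithm.

```python
import re


def update_url(host, port=15000, warmup_seconds=20):
    # Assumed URL form, inferred from the task names in the log above.
    return f"http://{host}:{port}/admin?op=update&warmupseconds={warmup_seconds}"


def approx_task_name(utility, url):
    # Assumption: each run of non-alphanumeric characters becomes one '-'.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", url)
    return f"{utility}_-{slug}-"


if __name__ == "__main__":
    # The same loopback URL on every host gives the same name everywhere.
    print(approx_task_name("wget", update_url("127.0.0.1")))
    # A host-specific URL gives a host-specific name.
    print(approx_task_name("wget", update_url("mdexhost01.dev.domain.com")))
```

Whatever the exact derivation is, the point stands: any host-specific value in the command yields a host-specific task name.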

If tasks must share a name, ensure that they do not run concurrently so that only one instance of a task name is ever in use at a time.
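If the names cannot be changed, the idea of never running two same-named tasks at once can be sketched as a lock per task name. This is an in-process illustration only: `run_exclusively` is a hypothetical helper, not an EAC or Deployment Template function. In an actual DT script the equivalent is simply to invoke the jobs one after another.

```python
import threading
from collections import defaultdict

_name_locks = defaultdict(threading.Lock)  # one lock per task name
_registry_guard = threading.Lock()         # protects the lock table itself


def run_exclusively(task_name, action):
    """Run action() while holding the lock for task_name.

    Callers using the same task_name are serialized; callers using
    different names can still run in parallel.
    """
    with _registry_guard:
        lock = _name_locks[task_name]
    with lock:
        return action()
```

Any threads that call `run_exclusively` with the same name will then execute their actions one at a time, so only one instance of that name is ever active.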
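Damn, what am I actually supposed to do here.. The explanation is simple enough, but I still don't get it.. -_-;
Once I figure it out, I'll write it up again in detail in Korean.. haha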
