xSuite Archive Prism User Guide

The "Scheduler" configuration node

The scheduler is a Windows service that executes tasks in the background. You can define scheduler jobs and scheduler triggers. Each job represents one specific background task.

The triggers contain the specific parameters that are required to execute the jobs, such as the time of execution and the directory that is monitored for an import. The triggers are specified in the form of a cron expression.
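As an illustration only (the exact cron dialect depends on the scheduler installation), a standard five-field cron expression such as the following would trigger a job every five minutes:

    */5 * * * *

Cron dialects that include a leading seconds field would express the same schedule as, for example, 0 */5 * * * ?.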

Notice

You can view the status of the defined jobs in the task control.

Notice

The basic configuration of default jobs is automatically created in the configuration database when the scheduler service is started for the first time. Changes to the configuration take effect either immediately or the next time the job is executed in the scheduler.

MonitorImportJob

At regular intervals, the "MonitorImportJob" monitors an import directory located locally on the archive server. If new data is stored in this directory, the directory will be recursively searched for import.job files. These files contain the parameters for import. The files are checked, and the corresponding locks (import.lock) are created in the file system. The files are prepared and checked in the import database so that the "ImportWorkerJobs" can carry out the import asynchronously and in parallel.

If the "MonitorImportJob" finds new subfolders, these folders are read in and processed. When a folder is read in, the job creates an import.lock file in it. This file signals that the folder is being processed and does not need to be imported again. The job checks the import data and creates an import request with the status "Pending" in the import database. The job divides the data into smaller units ("chunks") for processing.

An import always consists of a job definition and chunks. The chunks contain a defined number of documents.
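As an illustration only, a monitored import directory might look like this while a bulk import is waiting to be picked up (E:\Import is the default monitoring folder; the subfolder and document names are hypothetical, whereas import.job and import.lock are the file names described above):

    E:\Import\
        Invoices_2024-05\
            import.job        <- parameters for the import, read by the "MonitorImportJob"
            import.lock       <- created by the job to mark the folder as being processed
            document001.pdf
            document002.pdf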

Parameters

The following job parameters are available:

ChunkSize (Integer, default: 50)
  Number of documents per part of an import request (chunk)

MonitoredDirectory (String, default: E:\Import)
  Monitoring folder for bulk imports

ImportWorkerJob

At regular intervals, the "ImportWorkerJob" reads the import database for imports that have not yet been processed. If new data is available, a worker will retrieve a part (chunk) of an import job, lock this chunk and import all the documents it contains. The chunks are processed according to priority and creation time.

After processing, the status of the documents and the import chunk is updated in the import database. An entry is made in the monitoring log. Documents with errors are flagged.
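The following Python sketch is not product code; it only illustrates, under assumed names, the processing order described above: the next chunk is picked by priority and creation time, then imported document by document, and its status is written back.

    # Conceptual sketch only, not product code; all names are hypothetical.
    from dataclasses import dataclass, field

    @dataclass
    class Chunk:
        chunk_id: int
        priority: int          # lower value = processed first (assumption)
        created_at: float      # creation time of the chunk
        documents: list = field(default_factory=list)
        status: str = "Pending"

    def pick_next_chunk(pending_chunks):
        # Chunks are processed according to priority and creation time.
        return min(pending_chunks, key=lambda c: (c.priority, c.created_at), default=None)

    def process_chunk(chunk, import_document):
        # Import every document in the chunk; documents with errors are flagged.
        failed = 0
        for doc in chunk.documents:
            try:
                import_document(doc)
            except Exception:
                failed += 1
        chunk.status = "Error" if failed else "Imported"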

Notice

To process incorrect documents again, reset the chunk in the database manually.

Parameters

The following job parameters are available:

WorkerId (String, default: ImportWorker001)
  Unique name of the "ImportWorkerJob"

ImportMaintenanceJob

The "ImportMaintenanceJob" checks the completeness of the bulk imports at regular intervals. If no errors are found, the data that is no longer required will be deleted from the file system. The import request data will then be flagged as archived.

If any errors are found, an email with an error report is sent to the email address specified in the configuration under System Settings > Default Settings > Reporting.

IndexerSchedulerJob

The "IndexerSchedulerJob" performs the full-text indexing. Full-text indexing takes place asynchronously after import if it has been defined this way. During an import, the archive server creates entries in the "Indexjobber" database that are processed by the job instances. There is a temporary entry in the "Indexjobber" database for each document to be indexed.

If indexing fails, a maximum of three additional indexing attempts will be made for the document.

Notice

To start a new indexing attempt, manually remove the error counter and the document lock.
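As a conceptual illustration only (not product code; the field names are hypothetical), the retry behaviour described above can be pictured like this:

    # Conceptual sketch only, not product code; field names are hypothetical.
    MAX_ADDITIONAL_ATTEMPTS = 3

    def try_indexing(entry, index_function):
        # One temporary entry exists per document to be indexed.
        try:
            index_function(entry["document"])
            return "indexed"                      # entry can be removed
        except Exception:
            entry["error_count"] = entry.get("error_count", 0) + 1
            if entry["error_count"] > MAX_ADDITIONAL_ATTEMPTS:
                return "failed"                   # manual reset of error counter and lock required
            return "retry"                        # will be attempted again in a later run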

Parameters

The following job parameters are available:

BulkSize (Number, default: --)
  The number of documents indexed asynchronously at the same time

Tenants (Text, default: blank)
  Comma-separated list of tenants, or empty

IndexMigrationJob

The "IndexMigrationJob" migrates an Elasticsearch 2.x index to an Elasticsearch 7 index. Technically, the Elasticsearch 2 index of each archive in the tenant is migrated iteratively to an Elasticsearch 7 index. The job is repeated until all documents have been migrated to the Elasticsearch 7 index.

The archive can be used without restrictions during the migration. A temporary index is created for each index. The temporary index is deleted as soon as the migration is completed. Archives that are being migrated and archives that have already been migrated will receive the configuration parameter Migrated with the value InIndexMigration.

Once the migration is 100% complete, the job will automatically replace the Elasticsearch 2 index with the Elasticsearch 7 index. From that point on, only the Elasticsearch 7 index will be used.
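The following Python sketch is not product code; it only illustrates the iterative bulk migration described above. The list-based stand-in indexes and the bulk size of 100 are assumptions for the example.

    # Conceptual sketch only, not product code; all names are hypothetical.
    def migrate_index(old_index, new_index, bulk_size=100):
        # Copy documents into the temporary new index in bulks until 100% are migrated;
        # the archive remains usable while this runs.
        migrated = len(new_index)
        while migrated < len(old_index):
            bulk = old_index[migrated:migrated + bulk_size]
            new_index.extend(bulk)
            migrated += len(bulk)
        # Once the migration is complete, the new index replaces the old one.

    old_index = [f"doc{i}" for i in range(250)]   # stands in for the Elasticsearch 2.x index
    new_index = []                                # stands in for the temporary Elasticsearch 7 index
    migrate_index(old_index, new_index)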

Caution

When migrating nested archives, make sure that each archive has an index.

Subordinate archives of a node that do not have an index will not be migrated.

Parameters

The following job parameters are available:

BulkSize (Number)
  The number of archive documents that are migrated at the same time

MaximumMigrationJobs (Number)
  The number of archive indexes that are migrated at the same time

Tenants (Text)
  Comma-separated specification of the tenants to be migrated
  If the parameter value is empty, the default tenant will be migrated.
  If the parameter value * is specified, all tenants will be migrated.

TransferJob

"TransferJob" searches the configured archives for documents that are older than a configured time period. If enough documents are available for a transfer (minimum container size), these documents are written to a container and transferred to the "EndArchived" status.

Prerequisite: A shard of the ContainerBox type is present in the archive that is being processed.
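As a conceptual illustration only (not product code; the document structure and size field are hypothetical), the container rules described above can be sketched as follows:

    # Conceptual sketch only, not product code; names are hypothetical.
    def build_container(old_documents, min_size_mb=250, max_size_mb=500):
        # 'old_documents' are documents older than the configured TimeSpan.
        container, size_mb = [], 0.0
        for doc in old_documents:
            if size_mb + doc["size_mb"] > max_size_mb:
                break                              # the container must not exceed MaxSizeMB
            container.append(doc)
            size_mb += doc["size_mb"]
        if size_mb < min_size_mb:
            return None                            # minimum container size not reached: no transfer
        return container                           # these documents are set to "EndArchived"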

Parameters

The following job parameters are available:

Archives (Text, default: --)
  Comma-separated list of archives that the "TransferJob" processes
  If the field is empty, all archives will be checked.

MaxSizeMB (Number, default: 500)
  Maximum size of the containers that are created (in MB)

MinSizeMB (Number, default: 250)
  Minimum size of the containers that are created (in MB)

TimeSpan (Text, default: --)
  Period
  Syntax: {number}{unit}
  The following units are available:
    • d = days
    • m = months
    • y = years
  Example: 100d for 100 days

Tenant (Text, default: --)
  Tenant
  If no value is specified, the default tenant will be used.

LogArchiverSchedulerJob

The "LogArchiverSchedulerJob" archives log entries that are older than a configured value as a JSON structure in an archive document. If this is configured, full-text indexing will also be performed.
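The following Python sketch is not product code; it only illustrates how log entries could be combined into archive documents of EntriesPerDoc entries each. The JSON key "entries" is a hypothetical name.

    # Conceptual sketch only, not product code; the JSON layout is hypothetical.
    import json

    def batch_log_entries(log_entries, entries_per_doc=1000):
        # Combine EntriesPerDoc log entries into one JSON structure per archive document.
        for start in range(0, len(log_entries), entries_per_doc):
            yield json.dumps({"entries": log_entries[start:start + entries_per_doc]})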

Parameters

The following job parameters are available:

Archive (Text, default: /Logs)
  Archive to be written to

EntriesPerDoc (Number, default: 1000)
  The number of log entries that are combined into one archive document

Timespan (Number, default: 0)
  The number of days after which archiving takes place

Tenant (Text, default: blank)
  Tenant, or empty

RetentionJob

The "RetentionJob" searches the configured archives for documents whose standard expiry date has passed. These documents are deleted. The standard expiry date is defined via the Retention archive property or the Retention document type property.

Notice

Documents that have a legal hold are not deleted.
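As a conceptual illustration only (not product code; the field names are hypothetical), the deletion rule can be pictured like this:

    # Conceptual sketch only, not product code; field names are hypothetical.
    from datetime import datetime, timezone

    def is_deletable(document):
        if document.get("legal_hold"):
            return False                           # documents with a legal hold are never deleted
        # The expiry date is derived from the Retention archive or document type property.
        return document["expiry_date"] < datetime.now(timezone.utc)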

Parameters

The following job parameters are available:

Archives (Text, default: --)
  Comma-separated list of archives that the "RetentionJob" processes

Tenant (Text, default: --)
  Tenant
  If no value is specified, the default tenant will be used.

TempCleanup

At regular intervals, the "TempCleanup" job cleans up the directory in which the temporary files are stored.

Notice

You can configure this directory under System Settings > Temp Files.

Parameters

The following job parameters are available:

ExpireHours (Number, default: 24)
  Age of the file in hours
  If a file exceeds this age, the file will be deleted.
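As a conceptual illustration only (not product code), the cleanup rule corresponds to deleting every file in the temporary directory that is older than ExpireHours:

    # Conceptual sketch only, not product code.
    import os
    import time

    def cleanup_temp_directory(directory, expire_hours=24):
        cutoff = time.time() - expire_hours * 3600
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
                os.remove(path)                    # file is older than ExpireHours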

ReplicaJob

The "ReplicaJob" transfers all documents that are to be replicated to the foreign servers (ForeignServers). It also executes and saves all replications to local archives.

The aim of replication is to ensure data security and to prevent data loss in the case of downtime. In an archive with a replica configuration, a randomly generated value (change token) is written to the document with each write operation. If the document has been replicated correctly, the change token will also be available in the replication. The change token indicates that the master and the slave replications are the same.

If the Check replication property is activated in the archive configuration, a replication check is also carried out. A replication check can also be carried out for an archive. During the replication check, the master archive and the slave archive are compared with each other.
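The following Python sketch is not product code; it only illustrates the change-token mechanism described above with hypothetical field names.

    # Conceptual sketch only, not product code; field names are hypothetical.
    import secrets

    def write_document(master_document):
        # Each write operation stores a new, randomly generated change token on the document.
        master_document["change_token"] = secrets.token_hex(16)

    def replication_is_current(master_document, replica_document):
        # Master and replica are considered identical when their change tokens match.
        return master_document.get("change_token") == replica_document.get("change_token")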

Parameters

The following job parameters are available:

JobSize (Number, default: Infinite)
  Maximum number of documents that can be duplicated in one run
  If no value is specified, the "ReplicaJob" will duplicate all documents.

BatchSize (Number, default: 100)
  Number of documents that are replicated locally in a batch action

BulkSize (Number, default: 100)
  Size of the data block (in MB) that is transferred to a ForeignServer

ForeignSize (Number, default: 10)
  Maximum number of documents that can be transferred to a ForeignServer in one batch

ImportConverterJob

The "ImportConverterJob" converts import files into JSON format. The converted data can be archived after conversion using a standard file import job. The conversion process does not include any content checks of the data.
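The actual structure of the converted JSON data is product-specific; the following fragment is purely hypothetical and only illustrates the general idea of one converted document record:

    {
        "documentType": "Invoice",
        "indexFields": {
            "InvoiceNumber": "4711",
            "Date": "2024-05-01"
        },
        "file": "document001.pdf"
    }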