Input Format "Xml"
This input format supports splitting an XML file that represents a batch of documents into the individual documents, and extracting related file attachments from, or adding them to, the XML file.
| Property | Description |
|---|---|
| InputFormat[].DocNode | Definition of the node in an XML file on which the file is split into separate document-specific parts (optional). By default, an XML file is assumed to contain a single document. When splitting, the original document is discarded and a copy of it is generated for each new partial document. The document-specific part of the original XML file is added to the copy as another attachment. The split documents generated are given the name suffix The syntax to use for specifying the property is XPath, which is the same syntax used for extracting index data (see XML index data reader). For example, if the XML file contains multiple <Root> <Document>…</Document> <Document>…</Document> </Root> |
| InputFormat[].AttachFileMode | Mode determining how additional file attachments are extracted from or added to the XML file: The |
| InputFormat[].AttachFileNode(*) | This property applies to the Subparameters in XPath syntax for defining the XML nodes that contain file attachment data: Parameters 2 and 3 only apply to the Further subparameters with freely selectable names are permitted. These parameters are added to the file attachments as metadata. |
| InputFormat[].AttachFileRefMacro(*) | Field macro expression that must return a file name or name pattern in This expression can optionally also be used in In the macro expression, you can use file variables like |
| InputFormat[].RemoveXmlNs | Boolean value determining whether namespace information contained in XML files is removed before parsing, to avoid namespace-related parsing problems. XPath expressions for referencing nodes must then be specified without the namespace prefix where applicable. Default value: Caution: If namespaces are retained, XML documents are not readable if they use a default namespace (without a prefix), e.g., for the root node. |
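
The effect of `InputFormat[].DocNode` and `InputFormat[].RemoveXmlNs` can be illustrated outside the product with a short Python sketch. This is not the product's implementation. It uses Python's standard `xml.etree.ElementTree` module, which supports only a subset of XPath (relative paths such as `Document`, not absolute ones). The sample batch, the namespace URI `urn:example:batch`, and the helper names `strip_namespaces` and `split_documents` are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical batch file: two documents under a root that declares a
# default namespace (no prefix), the case the caution above refers to.
BATCH = """<Root xmlns="urn:example:batch">
  <Document><Id>1</Id></Document>
  <Document><Id>2</Id></Document>
</Root>"""


def strip_namespaces(root):
    """Remove the '{uri}' qualifier from every element tag, in place.

    Attribute names are left untouched in this sketch.
    """
    for elem in root.iter():
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root


def split_documents(root, doc_path):
    """Return one serialized XML fragment per node matched by doc_path."""
    return [
        ET.tostring(node, encoding="unicode").strip()
        for node in root.findall(doc_path)
    ]


root = ET.fromstring(BATCH)

# With the default namespace retained, an unprefixed path matches nothing.
matches_with_ns = root.findall("Document")

# After removing namespace information, the same unprefixed path works.
strip_namespaces(root)
parts = split_documents(root, "Document")

for part in parts:
    print(part)
```

Each fragment printed corresponds to one document-specific part of the batch. In the sketch, the unprefixed path only finds the `Document` nodes once the namespace qualifiers have been removed, which mirrors why XPath expressions are written without a prefix when namespaces are stripped.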