The DataTypes element
The DataTypes element is the document element of the DataType XML file. It serves as a container for all the individual DataTypes.
The DataType element
<DataType name = string sqltype = list manualStringSearch = (false|true|optional) display = format_pattern> </DataType>
Describes a single data type. The DataTypes element can contain zero or more DataType elements.
- name (required)
- The name for the data type. This is used both as a display string as well as the identifier for the data type.
- sqltype (optional)
- Whitespace-separated list of SQL type identifiers that are recognized to be of this type. No type identifier may occur on more than one DataType element.
- manualStringSearch (optional)
A value indicating which search strategy to use when dealing with string values
in the database. Three values are possible:
false: (default value) The pattern attributes of the Format elements are used to generate SQL SELECT queries to search with.
true: All relevant string fields are manually parsed and compared based on MatchNames.
optional: A SQL SELECT search is attempted first, and the user is then offered a MatchName based search if the first search did not find what they were looking for.
See the documentation on search strategies for more information.
- display (optional)
Defines a display format for the DataType. This display format should
give the value of the parsed string in an unambiguous way. If present, it will be
used in the case of an ambiguous match to show the user alternatives to pick from.
If not present, the description attribute of the Formats
will be used.
See the format pattern reference for more information.
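As an illustration, a minimal DataType XML file might look like the sketch below. The type name Year, the sqltype identifiers VARCHAR and CHAR, and the MatchName name year are assumptions chosen for the example and are not taken from an actual configuration. manualStringSearch is set to true so that no pattern attribute is required, and display is left out because it is optional. The Format and MatchName elements are described in the sections that follow.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: all names and SQL type identifiers are illustrative. -->
<DataTypes>
  <DataType name="Year" sqltype="VARCHAR CHAR" manualStringSearch="true">
    <!-- A DataType needs at least one Format. -->
    <Format regex="(\d{4})" description="Four-digit year">
      <!-- Subexpression 1 of the regex is exposed as the integer "year". -->
      <MatchName match="1" name="year" type="integer" />
    </Format>
  </DataType>
</DataTypes>
```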
The Format element
<Format regex = regular_expression pattern = format_pattern description = string> </Format>
Describes one of the string formats that make up the data type. A DataType can have one or more Formats.
- regex (required)
- The regular expression that defines how to recognize this format. All strings that match this regular expression will be marked to be of the DataType that this format belongs to. Use parentheses to mark subexpressions for use with the MatchName element. Format regular expressions must conform to the Boost regular expression syntax.
- pattern (optional)
A format pattern that describes how to generate a string from the parts defined
in the MatchName element. These strings are used for cross-format conversion
when generating SQL statements. When the manualStringSearch attribute
on the DataType element is false or optional,
the pattern attribute is required.
See the format pattern reference for more information.
- description (required)
- This attribute serves two purposes. It describes the format to the maintainer of the XML file, and it is also shown to the user to differentiate formats when a string matches more than one format with different MatchName values (an ambiguous match).
The MatchName element
<MatchName match = number name = identifier type = (string|integer|decimal) parseOptions = string> </MatchName>
Assigns a name to a subexpression marked with parentheses in the Format's
regular expression. A Format can have one or more MatchNames.
All of the DataType's Formats must define the same MatchNames with the same datatype,
and in such a way that they are semantically equivalent and can be compared.
The MatchName element makes it possible to compare two instances of the same DataType that do not match the same Format.
- match (required)
- The number of the match subexpression being referred to. In accordance with the behaviour of Boost regular expressions, "0" refers to the entire expression, and "1" and up refer to the individual marked subexpressions. Refer to the Boost regular expression syntax for more details.
- name (required)
- Defines the name of this marked subexpression. Between two instances of the same DataType, MatchNames with the same name will be compared. Refer to the Search Strategies documentation for more information.
- type (required)
The datatype of the marked subexpression. The string that is matched must be of
a type that can be parsed into this datatype. Refer to the MatchName datatype documentation
for more details.
Three values are supported:
string: a literal character string.
integer: an integral number.
decimal: a real or integral number with exact precision.
- parseOptions (optional)
- Options that are used to parse the matched string into the indicated type. Refer to the MatchName datatype documentation for more details.
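The requirement that all Formats of a DataType define the same MatchNames can be sketched as follows. The type name Date, the regular expressions, and the MatchName names day, month and year are hypothetical. manualStringSearch is set to true so that the pattern attribute can be left out.

```xml
<!-- Hypothetical Date type: both Formats define day, month and year as integers. -->
<DataType name="Date" manualStringSearch="true">
  <Format regex="(\d{1,2})/(\d{1,2})/(\d{4})" description="day/month/year">
    <MatchName match="1" name="day" type="integer" />
    <MatchName match="2" name="month" type="integer" />
    <MatchName match="3" name="year" type="integer" />
  </Format>
  <Format regex="(\d{4})-(\d{1,2})-(\d{1,2})" description="year-month-day">
    <!-- Same names and types as above, bound to different subexpression numbers. -->
    <MatchName match="1" name="year" type="integer" />
    <MatchName match="2" name="month" type="integer" />
    <MatchName match="3" name="day" type="integer" />
  </Format>
</DataType>
```

With definitions like these, the strings 25/12/2004 and 2004-12-25 would yield the same day, month and year values, so they can be compared even though they match different Formats.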
The if element
<if type = (integer|string) test = boolean_expression operation = (arithmetic_expression|constant)> </if>
Describes a simple conditional operation to be performed on the MatchName
value. A MatchName can contain zero or more if elements.
If it contains none, or the value of the matched expression doesn't match any of
the conditions specified, the value is assigned to the MatchName unmodified.
Only type="integer" and type="string" support conditionals.
The conditionals are processed in the order in which they appear in the XML file, and each is applied to the original (unmodified) value. The last one to match determines the value assigned to the MatchName. The operation to be performed is defined by the operation attribute, in the format described below.
- type (optional)
The type the match value must be treated as in the conditional expression. If omitted,
it is assumed to be the same type as the MatchName. Using a different type allows
you to do something like this:
<MatchName match="2" name="month" type="integer">
<if type="string" test="value = 'January'" operation="1" />
</MatchName>
- test (required)
The conditional expression. It must have the form:
value <conditional operator> <numeral>
The keyword value refers to the original parsed value of the matched expression, and <conditional operator> is one of '=', '!=', '<', '>', '<=' or '>='. The test can also be the boolean constant "true" or "false" (without quotes), in which case the operation is performed always or never, respectively.
- operation (required)
Defines the operation to be performed if the conditional expression evaluates true.
It can be an arithmetic expression of the following format:
value <operator> <numeral>
<numeral> <operator> value
In this string, the keyword value refers to the parsed value of the subexpression, and <operator> is one of '+', '-', '*' or '/' (addition, subtraction, multiplication and division respectively).
The expression can also be a literal, which depending on the type of the MatchName is either a single-quote (') or double quote (") enclosed string literal or a numeric literal.
Note that quotes inside a string literal do not need to be escaped.
Only integer MatchNames support arithmetic expressions. For a string MatchName, this must be a literal.
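The processing order described above can be illustrated with a sketch that expands two-digit years. The MatchName name year and the cut-off values are assumptions made for the example. Because a literal < is not allowed inside an XML attribute value, it is written as &lt; here.

```xml
<!-- Hypothetical example: expand two-digit years. -->
<MatchName match="3" name="year" type="integer">
  <!-- Each test is evaluated against the original parsed value. -->
  <if test="value &lt; 100" operation="value + 1900" />
  <!-- Listed last, so it takes precedence whenever it also matches. -->
  <if test="value &lt; 30" operation="value + 2000" />
</MatchName>
```

Under the rules above, a parsed value of 7 matches both conditions and the last one determines the result, 2007. A value of 85 matches only the first condition and becomes 1985. A value of 1999 matches neither and is assigned unmodified.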
The Condition element
<Condition test = boolean_expression> </Condition>
Allows conditional matching. If a DataType has a Condition
or ConditionGroup, a string matches only if it matches one of the
Formats and also meets the condition(s). A DataType can have
only one Condition. To create multiple conditions grouped together
by and and or operators, use the ConditionGroup element.
- test (required)
The condition to test for. It has the following format:
(<MatchName>|<numeral>) <conditional operator> (<MatchName>|<numeral>)
<MatchName> must refer to one of the names defined using the MatchName elements, and <conditional operator> is one of '=', '!=', '<', '>', '<=' or '>='.
Conditions can only be applied to MatchNames with type="integer".
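A single Condition might be used as in the following sketch. The type and MatchName names are hypothetical, and placing the Condition after the Formats inside the DataType is an assumption about element order.

```xml
<!-- Hypothetical example: reject strings such as 25/13/2004. -->
<DataType name="Date" manualStringSearch="true">
  <Format regex="(\d{1,2})/(\d{1,2})/(\d{4})" description="day/month/year">
    <MatchName match="1" name="day" type="integer" />
    <MatchName match="2" name="month" type="integer" />
    <MatchName match="3" name="year" type="integer" />
  </Format>
  <!-- The less-than sign is escaped as &lt; inside the attribute value. -->
  <Condition test="month &lt;= 12" />
</DataType>
```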
The ConditionGroup element
<ConditionGroup operator = (and|or)> </ConditionGroup>
Allows conditional matching on multiple conditions. Using the ConditionGroup element, you can group together Conditions using the and and or operators. Because a ConditionGroup can contain other ConditionGroup elements, it is possible to create complex expressions.
- operator (required)
- Defines the boolean operator by which the contained conditions are grouped.
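Nested groups could be written as in the fragment below, which would sit inside a DataType. The MatchName names month and day are hypothetical integer MatchNames. The fragment expresses "month is between 1 and 12, and either day is at most 28 or month is not 2".

```xml
<!-- Hypothetical fragment: month and day are assumed integer MatchNames. -->
<ConditionGroup operator="and">
  <Condition test="month >= 1" />
  <Condition test="month &lt;= 12" />
  <!-- A nested group combines its own Conditions with "or". -->
  <ConditionGroup operator="or">
    <Condition test="day &lt;= 28" />
    <Condition test="month != 2" />
  </ConditionGroup>
</ConditionGroup>
```

A literal > is permitted in an XML attribute value, so only the < operators need escaping.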
The CompatibleType element
<CompatibleType type = string> </CompatibleType>
A DataType can contain zero or more CompatibleType elements to indicate other DataTypes with which it can be compared. If the CompatibleMatchName elements refer to all the MatchNames of the target type, the manualStringSearch attribute on the target type determines how the comparison is done. If they refer to only some of them, a manual (MatchName based) search is always done. (TODO: partial compatible types depend on partial matches, and may not be implemented depending on the available time)
- type (required)
- The type with which this type is compatible. It must be the name of another DataType in the XML document.
The CompatibleMatchName element
<CompatibleMatchName name = string compatibleWith = string> </CompatibleMatchName>
Determines which source MatchName is compatible with which target MatchName. The two must be of the same type.
- name (required)
- The referred MatchName on this DataType.
- compatibleWith (required)
- The referred MatchName on the target DataType that the MatchName referred to by name is compatible with.
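A compatibility declaration might look like the sketch below. It assumes that CompatibleMatchName elements are nested inside the CompatibleType they belong to, and that a second, hypothetical DataType named DateTime exists that defines day, month and year along with further MatchNames such as hour and minute.

```xml
<!-- Hypothetical example: Date can be compared with DateTime on three MatchNames. -->
<DataType name="Date" manualStringSearch="true">
  <!-- Formats defining day, month and year are omitted for brevity. -->
  <CompatibleType type="DateTime">
    <CompatibleMatchName name="day" compatibleWith="day" />
    <CompatibleMatchName name="month" compatibleWith="month" />
    <CompatibleMatchName name="year" compatibleWith="year" />
  </CompatibleType>
</DataType>
```

Because only some of the target's MatchNames are referred to in this sketch, the rule stated for CompatibleType implies that a manual (MatchName based) search would be used.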