Data keys are placeholders in SQL statements, like “@d”, that are replaced with real values when executed by a client.
The naming rule is simple: “@” followed by a single word (no spaces) in trx files/SQL statements, but no “@” prefix in stage files:
Use in Trx File/SQL Statement
@d @id @last_updated
Configure in Stage File
d id last_updated
Examples in these docs use the canonical data key (@d) and other short names, but a good practice is to name data keys after the columns for which they’re used:
SELECT c FROM t WHERE id = @id AND n > @n AND @k IS NOT NULL
Data keys are configured in
stage.trx.data: each data key must have a corresponding entry in that map.
Since “@” is a special character in YAML, the data key names in a stage file do not have the “@” prefix:
stage: trx: - file: read.sql data: d: # @d generator: ing id: # @id generator: int params: max: 4294967296 last_updated: # @last_updated generator: int
The only required configuration is
generator: the name of a data generator to use for the data key.
But you most likely need to specify generator-specific parameters, as shown above for “id”.
stage.trx.data for the full configuration.
The terms “data key” and “data generator” are used interchangeably because they’re two sides of the same coin. A specific term is used when necessary to make a technical distinction.
You can reuse the same data key name in a trx file, like @id shown below.
SELECT c FROM t WHERE id = @id UPDATE t SET n=n+1 WHERE id = @id
This works because the default data scope is statement: Finch creates one data generator for @id in the
SELECT, and another for @id in the
However, there are two important points to consider:
- @id in both statements will have the same configuration because they’ll be the same map key (“id”) in
stage.trx.data—same name, same configuration.
- Data scope determines if the two @id are different or the same, and “when”: if they’re different, when do they generate different values?
Be sure to read Data / Scope to fully understand the second point.
The second point—data scope—is important because, in that trx, it looks like the two statements are supposed to access the same row (read the row, then update it). If that’s true, then the default statement data scope won’t do what you want. Instead, you need to make @id the same in both statements by giving it trx scope:
stage: trx: - file: read.sql data: id: scope: trx # <-- data scope generator: int
Now the generator will return the same value for @id wherever @id is used in the trx.
Meta data keys don’t directly generate a value but serve another purpose as documented. You do not configure meta data keys in a stage file.
@PREV refers to the previous data key.
It’s required for ranges:
BETWEEN @d AND @PREV.
stage: trx: - file: read.sql data: d: generator: int-range
The int-range generator returns two values (an ordered pair): the first value repalced @d, the second value replaces @PREV.