Testing

Tests should be developed before rules. When developing rules, tests can be run using the command:

sbt cleanTest

Note

This command copies the grammars to the src files before running the tests. To run the tests without updating the rules, use the command sbt test.

Developing Tests

Testing is performed through two files:

TestEntities.scala (reader/src/test/scala/org/parsertongue/mr/logx/entities/TestEntities.scala) and TestEvents.scala (reader/src/test/scala/org/parsertongue/mr/logx/events/TestEvents.scala), both of which use the testing functions defined in TestUtils.scala (reader/src/test/scala/org/parsertongue/mr/TestUtils.scala).

Both TestEntities.scala and TestEvents.scala call the same boolean function, checkMention(). checkMention takes two inputs: a sequence of Mentions (extracted by the MachineReadingSystem using the Odin rules in logx-reader) and a MentionTestCase. MentionTestCase comes in two varieties: ExistsMentionTestCase and ForAllMentionTestCase. The names indicate the quantifier applied when evaluating the sequence of Mentions, i.e. find at least one Mention such that ..., or ensure that for all Mentions ... . This difference in quantification is directly reflected in the evaluation procedure at the level of checkMention:

  • ExistsMentionTestCase:
// At least one mention must meet the criteria
mentions.exists { m => em.check(m) }
  • ForAllMentionTestCase:
// Every mention must meet the criteria
mentions.forall { m => em.check(m) }
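
The dispatch described above can be sketched in a self-contained form. This is a minimal illustration, not the code in TestUtils.scala: ToyMention, ToyExists, ToyForAll, and the single-label check are hypothetical stand-ins for the real Mention and MentionTestCase types.

```scala
object QuantifierSketch {
  // Hypothetical stand-in for an Odin Mention: only text and labels.
  final case class ToyMention(text: String, labels: Set[String])

  // Hypothetical stand-in for MentionTestCase; check tests a single mention.
  sealed trait ToyMentionTestCase {
    def check(m: ToyMention): Boolean
  }

  // Passes for a mention that carries the given label.
  final case class ToyExists(label: String) extends ToyMentionTestCase {
    def check(m: ToyMention): Boolean = m.labels.contains(label)
  }

  // Passes for a mention that does NOT carry the given label.
  final case class ToyForAll(forbiddenLabel: String) extends ToyMentionTestCase {
    def check(m: ToyMention): Boolean = !m.labels.contains(forbiddenLabel)
  }

  // The kind of test case selects the quantifier over the mentions.
  def checkMention(mentions: Seq[ToyMention], tc: ToyMentionTestCase): Boolean =
    tc match {
      case em: ToyExists => mentions.exists { m => em.check(m) }
      case fm: ToyForAll => mentions.forall { m => fm.check(m) }
    }
}
```

One consequence of standard Scala collection semantics is worth keeping in mind: forall over an empty sequence is true and exists over an empty sequence is false, so in this sketch a ForAll-style test passes when no mentions are extracted at all.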

Every test case evaluated by checkMention has its own boolean check() function. The check function of a MentionTestCase (of either variety) always calls the check functions of its TextTestCase and LabelTestCase(s), and calls the check function of ArgTestCase if one is present. ArgTestCase, in turn, evaluates the check functions of its own TextTestCase, LabelTestCase(s), and RoleTestCase(s) through its check function.

The bottom-level test cases TextTestCase, LabelTestCase, and RoleTestCase each come in two varieties, indicated by the prefix Positive- or Negative-. As the names suggest, a PositiveTextTestCase returns true if its string argument matches the text value of the Mention submitted to its check function. A NegativeTextTestCase returns true if its string argument does NOT match the Mention's text field.
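
As a rough illustration of the two polarities, the sketch below models the text variety only. The names ToyTextTestCase, ToyPositiveText, and ToyNegativeText are hypothetical, check takes the mention text directly instead of a Mention, and "matches" is assumed here to mean exact string equality; the real definitions in TestUtils.scala may differ on all of these points.

```scala
object PolaritySketch {
  // Hypothetical stand-in for TextTestCase.
  sealed trait ToyTextTestCase {
    def text: String
    def check(mentionText: String): Boolean
  }

  // True when the mention text equals the expected string (equality is an assumption).
  final case class ToyPositiveText(text: String) extends ToyTextTestCase {
    def check(mentionText: String): Boolean = mentionText == text
  }

  // True when the mention text differs from the given string.
  final case class ToyNegativeText(text: String) extends ToyTextTestCase {
    def check(mentionText: String): Boolean = mentionText != text
  }
}
```

For any given mention text, the two varieties built from the same string return opposite results.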

Entity and Event tests differ in that Entity tests cannot contain ArgTestCase or RoleTestCase, only TextTestCase and LabelTestCase checks.

The typical use of testing involves positive subtests embedded in an ExistsMentionTestCase, as in this example:

ExistsMentionTestCase( 
    labels = Seq(PositiveLabelTestCase("Vessel")),
    mentionSpan = PositiveTextTestCase("LNG Finima"),
    text = "LNG Finima"
)

To illustrate the potential utility of negative testing: the entity rules include patterns for dates, such as "12 Jun 2000". However, time-of-day expressions in military documents can be confused with such dates: "12 Jun 2000 hours" means 8 P.M. on June 12, not a date in the year 2000. Of course, the entity rules themselves have a variety of means available to prevent this misreading: negative lookahead in the date rule, higher-priority military-time rules, etc. But to test that they function properly, we can write tests that either verify that at least one Mention found in some piece of text has such-and-such properties, or deny that any Mention has some undesirable configuration of properties.

Complexities arise because MentionTestCase and ArgTestCase contain multiple subtests (TextTestCase, LabelTestCase, RoleTestCase). What combinations of polarities among subtests (e.g. all positive, all negative, or a mixture of positive and negative) fall within the expected range of use? Consider first the simplest case, corresponding to Entity tests: a MentionTestCase that does not contain any ArgTestCase. A mix of positive and negative subtests has two uses: (PositiveLabelTestCase, NegativeTextTestCase), to make sure the given label is picked up but doesn't correspond to this string; and (NegativeLabelTestCase, PositiveTextTestCase), to make sure this string is not read as... Continuing the example above, suppose the parent MentionTestCase has text "12 Jun 2000 hours". The MentionTestCase can be either an ExistsMentionTestCase or a ForAllMentionTestCase.

Combinations of positive and negative subtests under ForAllMentionTestCase seem of little use. For this reason, and to avoid possible confusion, we have written into the function definitions the requirement that ForAllMentionTestCase scopes over all-negative immediate subtests. Below is a typical example of the proper use of all-negative subtests for ForAllMentionTestCase:

ForAllMentionTestCase(
    labels = Seq(NegativeLabelTestCase("WhatQuery")), 
    mentionSpan = NegativeTextTestCase("of ostrich feathers"), 
    text = "TEUs of ostrich feathers"
)
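
One way such a requirement could be enforced is with require checks in the constructor, as in the sketch below. This is only an assumption about the mechanism; the actual definitions in TestUtils.scala may enforce it differently (for example, through parameter types). All Toy* names are hypothetical stand-ins.

```scala
object ForAllConstraintSketch {
  // Hypothetical stand-ins for LabelTestCase and TextTestCase.
  sealed trait ToyLabelTestCase { def label: String }
  final case class ToyPositiveLabel(label: String) extends ToyLabelTestCase
  final case class ToyNegativeLabel(label: String) extends ToyLabelTestCase

  sealed trait ToyTextTestCase { def text: String }
  final case class ToyPositiveText(text: String) extends ToyTextTestCase
  final case class ToyNegativeText(text: String) extends ToyTextTestCase

  // Hypothetical stand-in for ForAllMentionTestCase (no args, for brevity).
  final case class ToyForAll(
      labels: Seq[ToyLabelTestCase],
      mentionSpan: ToyTextTestCase,
      text: String
  ) {
    // Reject any positive label subtest at construction time.
    require(
      labels.forall {
        case _: ToyNegativeLabel => true
        case _                   => false
      },
      "ForAll test cases accept only negative label subtests"
    )
    // Reject a positive text subtest at construction time.
    require(
      mentionSpan match {
        case _: ToyNegativeText => true
        case _                  => false
      },
      "ForAll test cases accept only a negative text subtest"
    )
  }
}
```

With this design, a badly formed test case fails with an IllegalArgumentException as soon as it is constructed, before any Mentions are evaluated.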

Combinations of positive and negative subtests under ExistsMentionTestCase are potentially useful. In keeping with the flexibility outlined for some ArgTestCases below, we allow ExistsMentionTestCase to freely embed either polarity of LabelTestCase and TextTestCase. Example:

ExistsMentionTestCase( 
    labels = Seq(NegativeLabelTestCase("Date")), // wrong label
    mentionSpan = PositiveTextTestCase("LNG Finima"),
    text = "LNG Finima"
)
ExistsMentionTestCase( 
    labels = Seq(PositiveLabelTestCase("Vessel")),
    mentionSpan = NegativeTextTestCase("Finima"), // mentionSpan doesn't match text
    text = "LNG Finima"
)
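
To make the behavior of the two mixed-polarity tests above concrete, the following sketch evaluates them against hand-written toy mentions. It assumes that a mention passes only if both its label subtest and its text subtest pass, and that text matching is exact equality; ToyMention, Subtest, and the helper functions are hypothetical and do not come from TestUtils.scala.

```scala
object MixedPolaritySketch {
  // Hypothetical stand-in for an extracted Mention.
  final case class ToyMention(text: String, labels: Set[String])

  // A subtest is modeled as a plain predicate on a single mention.
  type Subtest = ToyMention => Boolean

  def positiveLabel(label: String): Subtest = m => m.labels.contains(label)
  def negativeLabel(label: String): Subtest = m => !m.labels.contains(label)
  def positiveText(text: String): Subtest = m => m.text == text
  def negativeText(text: String): Subtest = m => m.text != text

  // Exists semantics: at least one mention passes both subtests.
  def existsMention(mentions: Seq[ToyMention], label: Subtest, span: Subtest): Boolean =
    mentions.exists(m => label(m) && span(m))

  // Hand-written reader output for the text "LNG Finima".
  val correct: Seq[ToyMention] = Seq(ToyMention("LNG Finima", Set("Vessel")))
  // Output of a faulty reader that captures only part of the vessel name.
  val faulty: Seq[ToyMention] = Seq(ToyMention("Finima", Set("Vessel")))
}
```

Under these assumptions, existsMention(correct, negativeLabel("Date"), positiveText("LNG Finima")) and existsMention(correct, positiveLabel("Vessel"), negativeText("Finima")) both hold, while the same two tests fail on the faulty output.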

ArgTestCase polarity adds a further layer of potential complexity. To simplify use, we restrict the polarity of ArgTestCase itself to be strictly negative under ForAllMentionTestCase, just like the other immediate subtests of ForAllMentionTestCase: ForAllMentionTestCase(NegativeLabelTestCase, NegativeTextTestCase, NegativeArgTestCase(...)). In this configuration, we restrict NegativeArgTestCase to embed strictly positive subtests: PositiveTextTestCase, PositiveLabelTestCase, and PositiveRoleTestCase. Allowing NegativeArgTestCase to embed negative subtests, or a mix of positive and negative subtests, would add complexity without clear use cases to justify it. The example below illustrates the intended use:

ForAllMentionTestCase(
    labels = Seq(NegativeLabelTestCase("CargoQuery"), NegativeLabelTestCase("QuantityQuery")),
    text = "Some zebras are galloping to Scotland from Zimbabwe",
    mentionSpan = NegativeTextTestCase("Some zebras are heading to Scotland from Zimbabwe"),
    args = List( 
        NegativeArgTestCase(
            role = PositiveRoleTestCase("need"),
            labels = Seq(PositiveLabelTestCase("QuantifiedCargo")),
            text = PositiveTextTestCase("zebras")
        )
    )
)

ExistsMentionTestCase may embed PositiveArgTestCase and/or NegativeArgTestCase:

ExistsMentionTestCase(
    labels = Seq(PositiveLabelTestCase("Transport")),
    mentionSpan = PositiveTextTestCase("Frozen food that arrived before September 21st 2020 but after September 28th 2020"),
    text = "Frozen food that arrived before September 21st 2020 but after September 28th 2020.",
    args = List(
        NegativeArgTestCase(
            role = PositiveRoleTestCase("need"),
            labels = Seq(PositiveLabelTestCase("BeforeTime"), PositiveLabelTestCase("TimeExpression")),
            text = PositiveTextTestCase("before September 21st 2020")
        ),
        PositiveArgTestCase(
            role = PositiveRoleTestCase("time"),
            labels = Seq(PositiveLabelTestCase("AfterTime"), PositiveLabelTestCase("TimeExpression")),
            text = PositiveTextTestCase("after September 28th 2020")
        )
    ),
),

To allow desired flexibility with easily-understood uses, we allow the subtests within PositiveArgTestCase (only) to be positive or negative, independently of other subtests within the same PositiveArgTestCase. The example below illustrates:

// Negative Label test; positive Text, Role tests.
ExistsMentionTestCase(
    labels = Seq(PositiveLabelTestCase("Transport")),
    mentionSpan = PositiveTextTestCase("Frozen food that arrived before September 21st 2020 but after September 28th 2020"),
    text = "Frozen food that arrived before September 21st 2020 but after September 28th 2020.",
    args = List(
        PositiveArgTestCase(
            role = PositiveRoleTestCase("time"),
            labels = Seq(NegativeLabelTestCase("AfterTime")),
            text = PositiveTextTestCase("before September 21st 2020")
        ),
    )
),
// Negative Text test; positive Label, Role tests.
ExistsMentionTestCase(
    labels = Seq(PositiveLabelTestCase("Transport")),
    mentionSpan = PositiveTextTestCase("Frozen food that arrived before September 21st 2020 but after September 28th 2020"),
    text = "Frozen food that arrived before September 21st 2020 but after September 28th 2020.",
    args = List(
        PositiveArgTestCase(
            role = PositiveRoleTestCase("time"),
            labels = Seq(PositiveLabelTestCase("BeforeTime")),
            text = NegativeTextTestCase("after September 21st 2020")
        ),
    )
)