Jaccard Index Words
- JaccardIndex metric that operates on predictions and references that are strings.
It first splits each string into words, using whitespace as the separator.
For each prediction, it calculates the ratio of the number of words in Intersect(prediction_words, reference_words) to the number of words in Union(prediction_words, reference_words). If multiple references exist, it takes the best ratio achieved by any one of them.
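The calculation described above can be sketched in plain Python. This is an illustrative sketch only, not the library's implementation: the helper names `split_words` and `jaccard_index_words` are hypothetical, and both stripping the input before splitting and returning 0.0 when both word sets are empty are assumptions made for this example.

```python
import re


def split_words(text):
    # Split on runs of whitespace, mirroring the "\s+" pattern in the catalog entry.
    # Stripping first (an assumption of this sketch) avoids empty tokens at the edges.
    stripped = text.strip()
    return set(re.split(r"\s+", stripped)) if stripped else set()


def jaccard_index_words(prediction, references):
    # Hypothetical helper: best intersection-over-union across all references.
    pred_words = split_words(prediction)
    best = 0.0
    for reference in references:
        ref_words = split_words(reference)
        union = pred_words | ref_words
        # Assumption: two empty word sets score 0.0 rather than raising an error.
        score = len(pred_words & ref_words) / len(union) if union else 0.0
        best = max(best, score)
    return best


score = jaccard_index_words("the cat sat", ["the cat", "a dog sat down"])
print(score)
```

Here the first reference shares 2 words with the prediction out of 3 distinct words overall (2/3), while the second shares 1 out of 6, so the best ratio, 2/3, is the result.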
metrics.jaccard_index_words
JaccardIndexString(
splitter=RegexSplit(
by="\s+",
),
)
from unitxt.string_operators import RegexSplit
Explanation about JaccardIndexString
Calculates JaccardIndex on strings.
Requires setting the "splitter" to a FieldOperator (such as Split or RegexSplit) that tokenizes the predictions and references into lists of string tokens.
These tokens are passed to the JaccardIndex as lists.
Read more about catalog usage here.