evaluation_factory

Evaluation factory module.

This module provides an evaluation factory, which creates an evaluation object holding the metrics of a search string generated by SeSG.

Evaluation dataclass

Evaluation of a search string.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| n_scopus_results | int | Number of results returned by Scopus. | required |
| gs_size | int | Size of the gold standard. | required |
| qgs_in_scopus | list[Study] | QGS studies that were found in Scopus. | field(default_factory=list) |
| gs_in_scopus | list[Study] | GS studies that were found in Scopus. | field(default_factory=list) |
| gs_in_bsb | list[Study] | GS studies that were found via backward snowballing. | field(default_factory=list) |
| gs_in_sb | list[Study] | GS studies that were found via backward and forward snowballing. | field(default_factory=list) |

Source code in src/sesg/evaluation/evaluation_factory.py
@dataclass(frozen=True)
class Evaluation:
    """Evaluation of a search string.

    Args:
        n_scopus_results (int): Number of results returned by Scopus.
        gs_size (int): Size of the gold standard.
        qgs_in_scopus (list[Study]): QGS studies that were found in Scopus.
        gs_in_scopus (list[Study]): GS studies that were found in Scopus.
        gs_in_bsb (list[Study]): GS studies that were found via backward snowballing.
        gs_in_sb (list[Study]): GS studies that were found via backward and forward snowballing.
    """  # noqa: E501

    n_scopus_results: int
    gs_size: int

    qgs_in_scopus: list[Study] = field(default_factory=list)
    gs_in_scopus: list[Study] = field(default_factory=list)
    gs_in_bsb: list[Study] = field(default_factory=list)
    gs_in_sb: list[Study] = field(default_factory=list)

    @cached_property
    def start_set_precision(self) -> float:
        """Start set precision.

        Ratio between the number of GS studies found in Scopus and the number of Scopus results.

        If the number of Scopus results is 0, then the precision is 0.
        """  # noqa: E501
        if self.n_scopus_results == 0:
            return 0

        return len(self.gs_in_scopus) / self.n_scopus_results

    @cached_property
    def start_set_recall(self) -> float:
        """Start set recall.

        Ratio between the number of GS studies found in Scopus and the size of the GS.
        """
        return len(self.gs_in_scopus) / self.gs_size

    @cached_property
    def start_set_f1_score(self) -> float:
        """Start set F1 score.

        A balanced metric between precision and recall.

        If both the start set precision and recall are 0, then the F1 score is 0.
        """
        precision = self.start_set_precision
        recall = self.start_set_recall

        numerator = 2 * precision * recall
        denominator = precision + recall

        if denominator == 0:
            return 0

        return numerator / denominator

    @cached_property
    def bsb_recall(self) -> float:
        """Recall considering backward snowballing.

        Ratio between the number of GS studies found in backward snowballing and the size of the GS.
        """  # noqa: E501
        return len(self.gs_in_bsb) / self.gs_size

    @cached_property
    def sb_recall(self) -> float:
        """Recall considering backward and forward snowballing.

        Ratio between the number of GS studies found in backward and forward snowballing and the size of the GS.
        """  # noqa: E501
        return len(self.gs_in_sb) / self.gs_size

bsb_recall: float property cached

Recall considering backward snowballing.

Ratio between the number of GS studies found in backward snowballing and the size of the GS.

sb_recall: float property cached

Recall considering backward and forward snowballing.

Ratio between the number of GS studies found in backward and forward snowballing and the size of the GS.

start_set_f1_score: float property cached

Start set F1 score.

A balanced metric between precision and recall.

If both the start set precision and recall are 0, then the F1 score is 0.
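
As a minimal sketch of the formula behind this property, the harmonic mean of precision and recall can be written as below. The function name `f1_score` is hypothetical and not part of the module; it only mirrors the zero-denominator guard described above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper)."""
    denominator = precision + recall

    # When both inputs are 0 the ratio is undefined, so fall back to 0.
    if denominator == 0:
        return 0.0

    return 2 * precision * recall / denominator


print(f1_score(0.5, 0.25))  # roughly one third
print(f1_score(0.0, 0.0))   # guarded case
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a search string with high recall but very low precision still gets a low F1 score.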

start_set_precision: float property cached

Start set precision.

Ratio between the number of GS studies found in Scopus and the number of Scopus results.

If the number of Scopus results is 0, then the precision is 0.

start_set_recall: float property cached

Start set recall.

Ratio between the number of GS studies found in Scopus and the size of the GS.
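
To make the two start set ratios concrete, here is a small worked sketch with made-up counts. The helper `start_set_metrics` and its parameter `n_gs_found` are hypothetical names used only for illustration; they stand in for `len(gs_in_scopus)` and the two properties above.

```python
def start_set_metrics(
    n_gs_found: int,
    n_scopus_results: int,
    gs_size: int,
) -> tuple[float, float]:
    """Return (precision, recall) of the start set (hypothetical helper)."""
    # Precision is defined as 0 when Scopus returned nothing.
    precision = n_gs_found / n_scopus_results if n_scopus_results else 0.0

    # Like the property above, this does not guard against gs_size == 0.
    recall = n_gs_found / gs_size

    return precision, recall


# Illustrative numbers only: 5 GS studies found among 200 results, GS of 20.
precision, recall = start_set_metrics(n_gs_found=5, n_scopus_results=200, gs_size=20)
print(precision, recall)  # 0.025 0.25
```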

EvaluationFactory dataclass

Evaluation factory.

To evaluate a search string, use the evaluate method.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| gs | list[Study] | Gold standard. | required |
| qgs | list[Study] | Quasi gold standard. | required |

Source code in src/sesg/evaluation/evaluation_factory.py
@dataclass(frozen=True)
class EvaluationFactory:
    """Evaluation factory.

    To evaluate a search string, use the [`evaluate`][sesg.evaluation.evaluation_factory.EvaluationFactory.evaluate] method.

    Args:
        gs (list[Study]): Gold standard.
        qgs (list[Study]): Quasi gold standard.
    """  # noqa: E501

    gs: list[Study]
    qgs: list[Study]

    @cached_property
    def processed_gs_titles(self) -> list[str]:
        """Preprocessed GS titles."""
        return [s.processed_title for s in self.gs]

    @cached_property
    def processed_qgs_titles(self) -> list[str]:
        """Preprocessed QGS titles."""
        return [s.processed_title for s in self.qgs]

    @cached_property
    def studies_dict(self) -> dict[int, Study]:
        """Dictionary mapping a study ID to a study."""
        return {s.id: s for s in self.gs}

    def _get_study_by_id(self, id: int) -> Study:
        return self.studies_dict[id]

    @cached_property
    def directed_adjacency_list(self) -> dict[int, list[int]]:
        """Directed adjacency list of the GS."""
        return get_directed_adjacency_list_from_gs(self.gs)

    @cached_property
    def undirected_adjacency_list(self) -> dict[int, list[int]]:
        """Undirected adjacency list of the GS."""
        return directed_adjacency_list_to_undirected(self.directed_adjacency_list)

    def get_qgs_in_scopus(
        self,
        processed_scopus_titles: list[str],
    ) -> list[Study]:
        """Get QGS studies that were found in Scopus."""
        qgs_in_scopus = similarity_score(
            small_set=self.processed_qgs_titles,
            other_set=processed_scopus_titles,
        )

        return [self.qgs[id] for id, _ in qgs_in_scopus]

    def get_gs_in_scopus(
        self,
        processed_scopus_titles: list[str],
    ) -> list[Study]:
        """Get GS studies that were found in Scopus."""
        gs_in_scopus = similarity_score(
            small_set=self.processed_gs_titles,
            other_set=processed_scopus_titles,
        )

        return [self.gs[id] for id, _ in gs_in_scopus]

    def get_gs_in_bsb(
        self,
        gs_in_scopus: list[Study],
    ) -> list[Study]:
        """Get GS studies that were found via backward snowballing."""
        gs_in_bsb = snowballing(
            adjacency_list=self.directed_adjacency_list,
            start_set=[s.id for s in gs_in_scopus],
        )

        return [self._get_study_by_id(id) for id in gs_in_bsb]

    def get_gs_in_sb(
        self,
        gs_in_scopus: list[Study],
    ) -> list[Study]:
        """Get GS studies that were found via backward and forward snowballing."""
        gs_in_sb = snowballing(
            adjacency_list=self.undirected_adjacency_list,
            start_set=[s.id for s in gs_in_scopus],
        )

        return [self._get_study_by_id(id) for id in gs_in_sb]

    def evaluate(
        self,
        scopus_results: list[str],
    ) -> Evaluation:
        """Evaluate the performance of a search string using the results returned by Scopus.

        Args:
            scopus_results (list[str]): List with the titles of the studies returned by Scopus.

        Returns:
            An object with the evaluation metrics.
        """  # noqa: E501
        processed_scopus_titles = [process_title(title) for title in scopus_results]

        qgs_in_scopus = self.get_qgs_in_scopus(processed_scopus_titles)
        gs_in_scopus = self.get_gs_in_scopus(processed_scopus_titles)
        gs_in_bsb = self.get_gs_in_bsb(gs_in_scopus)
        gs_in_sb = self.get_gs_in_sb(gs_in_scopus)

        return Evaluation(
            qgs_in_scopus=qgs_in_scopus,
            gs_in_scopus=gs_in_scopus,
            gs_in_bsb=gs_in_bsb,
            gs_in_sb=gs_in_sb,
            gs_size=len(self.gs),
            n_scopus_results=len(scopus_results),
        )

directed_adjacency_list: dict[int, list[int]] property cached

Directed adjacency list of the GS.

processed_gs_titles: list[str] property cached

Preprocessed GS titles.

processed_qgs_titles: list[str] property cached

Preprocessed QGS titles.

studies_dict: dict[int, Study] property cached

Dictionary mapping a study ID to a study.

undirected_adjacency_list: dict[int, list[int]] property cached

Undirected adjacency list of the GS.
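
The helper `directed_adjacency_list_to_undirected` is imported from elsewhere and its source is not shown on this page. As a rough sketch of what such a conversion could look like (the function name `to_undirected` and its behavior are assumptions, not the library's implementation), every citation edge is mirrored so it can be followed in both directions:

```python
def to_undirected(directed: dict[int, list[int]]) -> dict[int, list[int]]:
    """Mirror every edge of a directed adjacency list (hypothetical helper)."""
    # Start from the existing edges, using sets to avoid duplicate neighbors.
    undirected: dict[int, set[int]] = {
        node: set(neighbors) for node, neighbors in directed.items()
    }

    # For each edge node -> neighbor, also record neighbor -> node.
    for node, neighbors in directed.items():
        for neighbor in neighbors:
            undirected.setdefault(neighbor, set()).add(node)

    return {node: sorted(neighbors) for node, neighbors in undirected.items()}


directed = {1: [2, 3], 2: [3], 3: []}
print(to_undirected(directed))  # {1: [2, 3], 2: [1, 3], 3: [1, 2]}
```

Traversing the mirrored graph reaches both the studies a paper cites (backward) and the studies that cite it (forward).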

evaluate(scopus_results)

Evaluate the performance of a search string using the results returned by Scopus.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| scopus_results | list[str] | List with the titles of the studies returned by Scopus. | required |

Returns:

| Type | Description |
| --- | --- |
| Evaluation | An object with the evaluation metrics. |

Source code in src/sesg/evaluation/evaluation_factory.py
def evaluate(
    self,
    scopus_results: list[str],
) -> Evaluation:
    """Evaluate the performance of a search string using the results returned by Scopus.

    Args:
        scopus_results (list[str]): List with the titles of the studies returned by Scopus.

    Returns:
        An object with the evaluation metrics.
    """  # noqa: E501
    processed_scopus_titles = [process_title(title) for title in scopus_results]

    qgs_in_scopus = self.get_qgs_in_scopus(processed_scopus_titles)
    gs_in_scopus = self.get_gs_in_scopus(processed_scopus_titles)
    gs_in_bsb = self.get_gs_in_bsb(gs_in_scopus)
    gs_in_sb = self.get_gs_in_sb(gs_in_scopus)

    return Evaluation(
        qgs_in_scopus=qgs_in_scopus,
        gs_in_scopus=gs_in_scopus,
        gs_in_bsb=gs_in_bsb,
        gs_in_sb=gs_in_sb,
        gs_size=len(self.gs),
        n_scopus_results=len(scopus_results),
    )

get_gs_in_bsb(gs_in_scopus)

Get GS studies that were found via backward snowballing.

Source code in src/sesg/evaluation/evaluation_factory.py
def get_gs_in_bsb(
    self,
    gs_in_scopus: list[Study],
) -> list[Study]:
    """Get GS studies that were found via backward snowballing."""
    gs_in_bsb = snowballing(
        adjacency_list=self.directed_adjacency_list,
        start_set=[s.id for s in gs_in_scopus],
    )

    return [self._get_study_by_id(id) for id in gs_in_bsb]
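
The `snowballing` helper used here is defined outside this module and is not shown on this page. The sketch below is one plausible reading of it, under the assumption that it performs a breadth-first traversal from the start set and returns every study ID it reaches, start set included. The name `snowball` is hypothetical.

```python
from collections import deque


def snowball(
    adjacency_list: dict[int, list[int]],
    start_set: list[int],
) -> list[int]:
    """Breadth-first traversal from the start set (hypothetical helper)."""
    # Deduplicate the start set while preserving its order.
    order: list[int] = list(dict.fromkeys(start_set))
    visited: set[int] = set(order)
    queue = deque(order)

    while queue:
        current = queue.popleft()
        # Follow each outgoing edge (a reference, in the directed case).
        for neighbor in adjacency_list.get(current, []):
            if neighbor not in visited:
                visited.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)

    return order


# Study 4 cites study 1, but backward snowballing only follows references.
references = {1: [2], 2: [3], 3: [], 4: [1]}
print(snowball(references, start_set=[1]))  # [1, 2, 3]
```

With a directed adjacency list, study 4 is never reached from study 1; passing an undirected list instead would also pick it up, which is the difference between backward-only and backward-and-forward snowballing.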

get_gs_in_sb(gs_in_scopus)

Get GS studies that were found via backward and forward snowballing.

Source code in src/sesg/evaluation/evaluation_factory.py
def get_gs_in_sb(
    self,
    gs_in_scopus: list[Study],
) -> list[Study]:
    """Get GS studies that were found via backward and forward snowballing."""
    gs_in_sb = snowballing(
        adjacency_list=self.undirected_adjacency_list,
        start_set=[s.id for s in gs_in_scopus],
    )

    return [self._get_study_by_id(id) for id in gs_in_sb]

get_gs_in_scopus(processed_scopus_titles)

Get GS studies that were found in Scopus.

Source code in src/sesg/evaluation/evaluation_factory.py
def get_gs_in_scopus(
    self,
    processed_scopus_titles: list[str],
) -> list[Study]:
    """Get GS studies that were found in Scopus."""
    gs_in_scopus = similarity_score(
        small_set=self.processed_gs_titles,
        other_set=processed_scopus_titles,
    )

    return [self.gs[id] for id, _ in gs_in_scopus]

get_qgs_in_scopus(processed_scopus_titles)

Get QGS studies that were found in Scopus.

Source code in src/sesg/evaluation/evaluation_factory.py
def get_qgs_in_scopus(
    self,
    processed_scopus_titles: list[str],
) -> list[Study]:
    """Get QGS studies that were found in Scopus."""
    qgs_in_scopus = similarity_score(
        small_set=self.processed_qgs_titles,
        other_set=processed_scopus_titles,
    )

    return [self.qgs[id] for id, _ in qgs_in_scopus]

Study dataclass

Represents a study.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| id | int | Study's ID. | required |
| title | str | Study's title. | required |
| references | list[Study] | Study's references. Defaults to an empty list. | field(default_factory=list) |

Source code in src/sesg/evaluation/evaluation_factory.py
@dataclass(unsafe_hash=True)
class Study:
    """Represents a study.

    Args:
        id (int): Study's ID.
        title (str): Study's title.
        references (list[Study]): Study's references. Defaults to an empty list.
    """  # noqa: E501

    id: int
    title: str

    references: list["Study"] = field(default_factory=list)

    @cached_property
    def processed_title(self) -> str:
        """Preprocessed title."""
        return process_title(self.title)

processed_title property cached

Preprocessed title.

get_directed_adjacency_list_from_gs(gs)

Creates a directed adjacency list from a gold standard set of studies.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| gs | list[Study] | Set of studies that compose the GS. | required |

Returns:

| Type | Description |
| --- | --- |
| dict[int, list[int]] | A dictionary mapping a study ID to its references. |

Source code in src/sesg/evaluation/evaluation_factory.py
def get_directed_adjacency_list_from_gs(
    gs: list["Study"],
) -> dict[int, list[int]]:
    """Creates a directed adjacency list from a gold standard set of studies.

    Args:
        gs (list[Study]): Set of studies that compose the GS.

    Returns:
        A dictionary mapping a study ID to its references.
    """
    adjacency_list = {}

    for study in gs:
        adjacency_list[study.id] = [ref.id for ref in study.references]

    return adjacency_list
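
A short usage sketch follows. To keep it runnable on its own, `Study` is redefined as a minimal stand-in for the dataclass documented above, and the function body is condensed into a dict comprehension; the study titles are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Study:
    """Minimal stand-in for the Study dataclass documented above."""

    id: int
    title: str
    references: list["Study"] = field(default_factory=list)


def get_directed_adjacency_list_from_gs(gs: list[Study]) -> dict[int, list[int]]:
    """Map each study ID to the IDs of the studies it references."""
    return {study.id: [ref.id for ref in study.references] for study in gs}


# Placeholder studies: A cites B and C, B cites C, C cites nothing.
study_c = Study(id=3, title="Study C")
study_b = Study(id=2, title="Study B", references=[study_c])
study_a = Study(id=1, title="Study A", references=[study_b, study_c])

print(get_directed_adjacency_list_from_gs([study_a, study_b, study_c]))
# {1: [2, 3], 2: [3], 3: []}
```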

process_title(string)

Strips the string and turns every character to lowercase.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| string | str | The string to preprocess. | required |

Returns:

| Type | Description |
| --- | --- |
| str | The preprocessed string. |

Examples:

>>> process_title(" A string Here.  \n")
'a string here.'
Source code in src/sesg/evaluation/evaluation_factory.py
def process_title(
    string: str,
) -> str:
    r"""Strips the string and turns every character to lowercase.

    Args:
        string: The string to preprocess.

    Returns:
        The preprocessed string.

    Examples:
        >>> process_title(" A string Here.  \n")
        'a string here.'
    """
    return string.strip().lower()

similarity_score(small_set, other_set)

Uses TfidfVectorizer, cosine_similarity, and Levenshtein to calculate the intersection of two sets of strings.

You might need to preprocess the strings with process_title.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| small_set | list[str] | Set of strings. If possible, the length of this set should be smaller than the other one. | required |
| other_set | list[str] | Set of strings to compare against. | required |

Returns:

| Type | Description |
| --- | --- |
| list[tuple[int, int]] | List of tuples, where the tuple (i, j) means that small_set[i] is similar to other_set[j]. |

Examples:

>>> small_set = ["machine learning", "databases", "search strings"]
>>> other_set = ["Databases, an introduction", "Machine Learning", "Search String"]
>>> similarity_score(
...     small_set=small_set,
...     other_set=other_set
... )
[(0, 1), (2, 2)]
Source code in src/sesg/evaluation/evaluation_factory.py
def similarity_score(
    small_set: list[str],
    other_set: list[str],
) -> list[tuple[int, int]]:
    """Uses `TfidfVectorizer`, `cosine_similarity`, and `Levenshtein` to calculate the intersection of two sets of strings.

    You might need to preprocess the strings with [`process_title`][sesg.evaluation.evaluation_factory.process_title].

    Args:
        small_set (list[str]): Set of strings. If possible, the length of this set should be smaller than the other one.
        other_set (list[str]): Set of strings to compare against.

    Returns:
        List of tuples, where the tuple `(i, j)` means that `small_set[i]` is similar to `other_set[j]`.

    Examples:
        >>> small_set = ["machine learning", "databases", "search strings"]
        >>> other_set = ["Databases, an introduction", "Machine Learning", "Search String"]
        >>> similarity_score(
        ...     small_set=small_set,
        ...     other_set=other_set
        ... )
        [(0, 1), (2, 2)]
    """  # noqa: E501
    if len(other_set) == 0:
        return []

    train_set = [*small_set, *other_set]

    tfidf_vectorizer = TfidfVectorizer()
    tfidf_matrix = tfidf_vectorizer.fit_transform(train_set)

    first_set_matrix = tfidf_matrix[0 : len(small_set)]
    second_set_matrix = tfidf_matrix[len(small_set) : len(small_set) + len(other_set)]

    similarity_matrix = cosine_similarity(
        first_set_matrix,
        second_set_matrix,
    )

    lines: int
    lines, _ = similarity_matrix.shape

    similars: list[tuple[int, int]] = []

    for index_of_first_set_element in range(lines):
        # contains the row of the similarity matrix for the current element
        line = similarity_matrix[index_of_first_set_element]

        index_of_closest_element_in_second_set: int = argsort(line)[-1]

        first_set_element = small_set[index_of_first_set_element]
        second_set_element = other_set[index_of_closest_element_in_second_set]

        distance = Levenshtein.distance(
            first_set_element,
            second_set_element,
            score_cutoff=10,
        )

        if distance < 10:
            similars.append(
                (index_of_first_set_element, index_of_closest_element_in_second_set)
            )

    return similars
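
The last step of the function accepts a candidate pair only when the edit distance between the two strings is below 10. The sketch below isolates that filter. It swaps in a plain dynamic-programming edit distance instead of the `Levenshtein` package so that it runs with the standard library alone; `edit_distance` and `is_match` are hypothetical names.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via row-by-row dynamic programming."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))

    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current

    return previous[-1]


def is_match(a: str, b: str, threshold: int = 10) -> bool:
    """Accept the pair only if fewer than `threshold` edits separate it."""
    return edit_distance(a, b) < threshold


print(is_match("search strings", "search string"))          # True
print(is_match("databases", "databases, an introduction"))  # False
```

This agrees with the docstring example above, where "databases" is not paired with the much longer "Databases, an introduction" even though that title is its nearest neighbor by cosine similarity.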