4  Methods

4.1 citation searching

Relevant references can be retrieved by analyzing the citation relationships between known relevant articles, called seed papers (also known as core papers), and the references they cite or are cited by. This technique is called citation searching, citation analysis or citation tracking (Hirt et al. (2024), Belter (2016), Hinde and Spackman (2015)).

The TARCiS Statement provides a guideline for citation searching (Hirt et al. (2024)).

Citation searching can be advisable at the very beginning of a systematic literature search in order to find additional core papers and to identify additional search terms. After finishing the final search it can be used to find additional relevant studies that were not retrieved by systematic searching using search strategies.

Some research questions are hard to search by a conventional Boolean approach, for instance due to very ambiguous search terms or very recent research topics for which no index terms and no established terminology exist yet. In these cases a citation search can be more useful than a Boolean search, as it relies solely on the citation relationships between known and yet unknown publications.

There are different approaches to searching citations: First, in the backward citation searching method the lists of references cited in the seed papers are screened for relevant articles. Second, the analysis of publications citing the seed paper as a reference is called forward citation searching. See Figure 4.1.

Vector drawing showing a timeline with papers above it, illustrating their relative time of publication. Arrows point from the most recent publications, which are called 'citing references', to the seed paper in the center, indicating that these papers cite the seed paper. The citation searching technique based on this relationship is called 'forward citation searching'. Similarly, arrows from the seed paper to older publications, which are called 'cited references', indicate the relationship of the seed paper with the references it cites. The corresponding technique is called 'backward citation searching'.
Figure 4.1: backward and forward citation searching

Third, the identification of relevant literature by counting co-cited or co-citing references which connect the seed paper with other papers is called co-citation searching. In theory, the relevance of these other papers should increase with an increasing number of connecting publications.

4.1.1 co-cited citation searching

A seed paper (SP) is connected to other relevant publications (ORP) if both the SP and the ORP are cited by the same paper. In this case the SP and the ORP are co-cited by a number of publications. See Figure 4.2.

Figure 4.2: co-cited citation searching retrieves publications which share citations with the seed paper

4.1.2 co-citing citation searching

A similar relationship exists between the SP and ORP if they both share a number of mutual references. In this case they are co-citing a set of common literature. See Figure 4.3.

Figure 4.3: co-citing citation searching retrieves publications which share references with the seed paper
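
The four relationships described in this section can be sketched on a toy citation graph. The graph, the paper names and the function names below are hypothetical and serve only as an illustration; real citation data would come from a citation index.

```python
from collections import Counter

# Hypothetical toy citation graph: each paper maps to the papers it cites.
CITES = {
    "seed": ["old1", "old2"],
    "new1": ["seed", "other1"],
    "new2": ["seed", "other1", "other2"],
    "other1": ["old1", "old2"],
    "other2": ["old1"],
}

def backward(paper):
    """Backward citation searching: the references cited by the paper."""
    return set(CITES.get(paper, []))

def forward(paper):
    """Forward citation searching: all papers that cite the paper."""
    return {p for p, refs in CITES.items() if paper in refs}

def co_cited(paper):
    """For each other paper, count the publications citing it together with `paper`."""
    counts = Counter()
    for citing in forward(paper):
        for ref in backward(citing):
            if ref != paper:
                counts[ref] += 1
    return counts

def co_citing(paper):
    """For each other paper, count the references it shares with `paper`."""
    refs = backward(paper)
    counts = Counter()
    for other in CITES:
        shared = refs & backward(other)
        if other != paper and shared:
            counts[other] = len(shared)
    return counts

print(backward("seed"), forward("seed"))
print(co_cited("seed").most_common())
print(co_citing("seed").most_common())
```

In this sketch a higher count corresponds to more connecting publications, which is the ranking idea mentioned above; whether that actually reflects relevance has to be judged during screening.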

4.2 critical appraisal

One of the main steps in evidence-based medicine is the critical appraisal of evidence. Critical appraisal checklists help to assess the methodological quality of a study and to determine the extent to which a study has excluded or minimized the possibility of bias in its design, conduct and analysis.

See also: Twells (2021), Buccheri and Sharifi (2017), Fineout-Overholt et al. (2010a), Fineout-Overholt et al. (2010b), Fineout-Overholt et al. (2010c)

4.3 deduplication

Whenever multiple databases are searched it is unavoidable that various references are retrieved more than once due to the overlapping of contents. The removal of duplicate records from a dataset is called deduplication.

Deduplication can be carried out manually using reference management software or in an automated way using review tools. The methods and tools for deduplication differ in their mode of operation and the quality of results (Janka and Metzendorf (2024), McKeown and Mir (2021), Bramer et al. (2016)).
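
As a rough illustration of automated deduplication, the following sketch keeps the first record for each matching key. The matching rule (DOI first, otherwise normalized title plus year), the field names and the sample records are assumptions made for this example; they do not reproduce the algorithm of any of the tools cited above.

```python
import re

def dedup_key(record):
    """Prefer the DOI as key; otherwise fall back to normalized title plus year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    # Lowercase the title and collapse punctuation and whitespace.
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return ("title", title, record.get("year"))

def deduplicate(records):
    """Keep the first record for each key and drop later duplicates."""
    seen = set()
    unique = []
    for record in records:
        key = dedup_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

# Hypothetical records as they might be exported from two databases.
records = [
    {"title": "Example study A", "year": 2024, "doi": "10.1000/XYZ1"},
    {"title": "Example Study A.", "year": 2024, "doi": "10.1000/xyz1"},
    {"title": "Example study B", "year": 2020, "doi": ""},
    {"title": "Example Study B!", "year": 2020, "doi": None},
]
print(len(deduplicate(records)))
```

A rule this simple misses duplicates in which only one of the two records carries a DOI, which is one reason why the results of different methods and tools differ.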

An unwelcome practice, which makes deduplication (and concomitantly any scientific work) more difficult, is repetitive, duplicate or redundant publishing (Ding et al. (2020), Johnson (2006), Kassirer and Angell (1995)).

4.4 filters

Filters (also search filters, filter strategies or hedges) are search strategies designed to retrieve records for a specific concept of the research question.

There are filters for specific patient groups or diseases, outcomes, study types (for instance to retrieve only randomized controlled trials (RCTs)), or filters for other aspects of the research question, e.g. adverse effects, diagnostic accuracy or patient values (Waffenschmidt et al. (2020), Salvador-Oliván, Marco-Cuenca, and Arquero-Avilés (2021), Lee et al. (2012), Golder et al. (2006)).

Validated filters are developed, tested and optimized for sensitivity and precision by experts (Haynes and Wilczynski (2004), Glanville et al. (2008)).

Filter strategies can be implemented in a search strategy just like any concept of the research question. Table 4.1 illustrates a search strategy with a short qualitative research filter in lines 8 to 11, which is added to the overall search in line 12.

Table 4.1
1   hypertension/
2   (hypertension or high blood pressure).ti,ab.
3   1 or 2
4   exp patient attitude/
5   *patient satisfaction/
6   (choice$ or empower$).ti.
7   or/4-6
8   interview$.mp.
9   experience$.mp.
10  qualitative.tw.
11  or/8-10
12  3 and 7 and 11
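
How the combination lines of Table 4.1 work can be sketched with sets of record IDs. The function name and the record IDs are hypothetical, and the parser only handles the two line formats used in the table; it is not a reimplementation of the Ovid syntax.

```python
import re

def combine(line, results):
    """Resolve a combination line such as 'or/4-6' or '3 and 7 and 11'.

    Assumes that a line uses only one kind of operator.
    """
    match = re.fullmatch(r"(or|and)/(\d+)-(\d+)", line)
    if match:
        operator = match.group(1)
        numbers = range(int(match.group(2)), int(match.group(3)) + 1)
    else:
        parts = line.split()
        operator = parts[1]
        numbers = [int(part) for part in parts[0::2]]
    sets = [results[number] for number in numbers]
    return set.union(*sets) if operator == "or" else set.intersection(*sets)

# Hypothetical record IDs retrieved by the term lines of Table 4.1.
results = {1: {1, 2}, 2: {2, 3}, 4: {2}, 5: {3, 9}, 6: {4},
           8: {2, 3}, 9: {5}, 10: {9}}
results[3] = combine("1 or 2", results)            # hypertension concept
results[7] = combine("or/4-6", results)            # patient attitude concept
results[11] = combine("or/8-10", results)          # qualitative research filter
results[12] = combine("3 and 7 and 11", results)   # final combination
print(sorted(results[12]))
```

The filter in line 11 is treated exactly like the two topic concepts: it is built with OR and then intersected with the other concepts using AND.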

4.5 field code

The data fields of databases have short designations called field codes, field tags or field labels, which allow the fields to be searched separately as part of a search query or search strategy.

Depending on the syntax of the database or search interface different field codes are available for searching. If no field code is used in a search query, the search term is usually searched in all fields or a preset variety of fields.

Example

In PubMed the query hypnosis[TIAB] will basically search for records with “hypnosis” in the title or abstract. In Ovid the same search would be written as hypnosis.ti,ab.

See also PubMed Search field tags and Ovid Medline Fields.

4.6 focus topic

Focus topics (also called major topics) are weighted index terms. If a particular topic is at the heart of a publication, index terms for that topic will be assigned as a so-called focus topic. This is often displayed in the databases by writing an asterisk before or after the index term.

Risk of confusion

The asterisk * appears also as a wildcard character for truncation. The two look the same, but have different meanings and uses. Don’t let them confound you.

By labeling an index term this way, its significance for the publication is visualized. Moreover, searching for the focus topic instead of the index term allows for a search to be focused only on the most important papers for a particular topic.

Example
  1. A systematic review on hypertension will most likely feature the focus topic *hypertension, whereas an article about vascular diseases might be indexed with hypertension as a normal subject heading. Searching for hypertension[majr] in PubMed or *hypertension/ in Ovid would yield only the first of those two articles. A search for hypertension[mh] in PubMed or hypertension/ in Ovid would find both of them.

  2. The article Apnoeic oxygenation during paediatric tracheal intubation by Fuchs et al. (2024) is indexed with Intubation, Intratracheal* / adverse effects and Intubation, Intratracheal* / methods as major topics, whereas Hypoxia / etiology was added as a regular MeSH term as it is not the main focus of the article.

4.7 grey literature

Grey literature (also gray literature) is so-called non-conventional or informal literature which “cannot readily be obtained through normal bookselling channels”, for example conference proceedings, reports, specifications, supplementary publications, technical notes, theses and translations (Wood (1982), Auger (2017)).

According to Paez (2017), grey literature can be an important resource for systematic literature searching.

URL Description
https://www.proquest.com/ Dissertations and theses (global)
http://search.ndltd.org/ Dissertations and theses (global)
https://search.worldcat.org/ Dissertations and theses (global)
https://www.dart-europe.org/ Dissertations and theses (EU)
https://www.base-search.net/ Bielefeld Academic Search Engine
https://www.science.gov/ Research results from U.S. federal agencies
https://ntrl.ntis.gov/NTRL/ National Technical Reports Library (US)
https://mednar.com/mednar/desktop/en/search.html free deep web search engine for medical topics
https://wonder.cdc.gov/ Wide-ranging Online Data for Epidemiologic Research
https://www.ahrq.gov/ Agency for Healthcare Research and Quality
https://v2.sherpa.ac.uk/opendoar/ Directory Of Academic Repositories
https://www.greynet.org/ Grey Literature Network Service
doi:10.17026/dans-xtf-47w5 Archive of OpenGrey.eu

4.8 index terms

Index terms, also called subject headings (or sometimes descriptors; DE: Schlagwörter), are expressions defined for the purpose of indexing. They represent content-related concepts and are organized as a controlled vocabulary. Adding index terms to a record makes it retrievable based on its contents.

Using index terms in a search is an essential part of a systematic literature search. The index term search can be focused by employing subheadings or searching the terms as focus topics.

Index terms should not be confused with free text terms (DE: Stichwörter) or author keywords.

Example

The MeSH term for the concept of heart attack or cardiovascular stroke is Myocardial infarction.

The Emtree term (index term in Embase) for the same concept is heart infarction.

4.9 limits

There are various means to restrict the results of a search. One of them is the use of so-called limits: filter options provided by the search interface that restrict the search results by characteristics such as publication type, language or year of publication.

Warning
  1. Database limits are not to be mistaken for validated filter strategies. Limits are usually based on certain data fields, whereas validated search filters are more complex search strategies. See also Section 4.4.

  2. Database limits based on index terms may lead to the unintended exclusion of non-indexed records.

  3. The usage of database limits is not always evident from the search history. If it is not possible to display applied limits in the search history, their use should be reported in the documentation.

4.10 nesting

Nesting refers to the use of parentheses ( ) to group search terms within a query. The purpose of nesting a search query is to define the order in which the search terms and operators are processed.

Without nesting the order in which the elements of a search query are processed depends on the rules of the respective search interface. This means that the same query might produce very different results in the individual databases.

Examples
  1. Within PubMed all searches are processed in a left-to-right sequence.

Thus the following PubMed queries yield completely different results:

  • exercise[MH] AND infection[MH] OR heart[MH]
  • exercise[MH] AND heart[MH] OR infection[MH]
  • heart[MH] OR infection[MH] AND exercise[MH]

On the other hand, the following nested queries are equivalent:

  • exercise[MH] AND (infection[MH] OR heart[MH])
  • exercise[MH] AND (heart[MH] OR infection[MH])
  • (heart[MH] OR infection[MH]) AND exercise[MH]
  2. Web of Science executes the search according to the order of precedence of the operators.
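
The effect of left-to-right processing can be reproduced with sets of record IDs. The sets and the helper function below are hypothetical and only mimic the behaviour described for PubMed in the first example.

```python
# Hypothetical record IDs indexed with each MeSH term.
SETS = {
    "exercise": {1, 2, 3},
    "infection": {2, 4},
    "heart": {3, 5},
}

def left_to_right(tokens):
    """Evaluate a flat query such as ['exercise', 'AND', 'infection', 'OR', 'heart']
    strictly from left to right, without operator precedence."""
    result = set(SETS[tokens[0]])
    for operator, term in zip(tokens[1::2], tokens[2::2]):
        if operator == "AND":
            result &= SETS[term]
        elif operator == "OR":
            result |= SETS[term]
        else:
            raise ValueError(f"unsupported operator: {operator}")
    return result

q1 = left_to_right(["exercise", "AND", "infection", "OR", "heart"])
q2 = left_to_right(["exercise", "AND", "heart", "OR", "infection"])
q3 = left_to_right(["heart", "OR", "infection", "AND", "exercise"])

# The nested form groups the OR terms first, regardless of their position.
nested = SETS["exercise"] & (SETS["infection"] | SETS["heart"])

print(sorted(q1), sorted(q2), sorted(q3), sorted(nested))
```

With these sets the three flat queries return three different results, and only the last one happens to coincide with the nested query, because its OR part is processed first.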

4.11 operators

4.11.1 Boolean operators

Databases usually allow a search to be structured using the three basic operations of Boolean algebra, which are expressed with the Boolean operators AND (conjunction), OR (disjunction) and NOT (negation). The AND operator creates an intersection of sets, OR creates a union of sets and NOT excludes sets (see Figure 4.4).

Figure 4.4: Boolean Operators
Examples
  • The query "heart attack" AND diabetes AND obesity retrieves only records featuring all three terms.
  • The query "cardiac arrest" OR asystole retrieves records containing at least one of the two terms.
  • The query animals NOT humans removes all records mentioning humans from the set animals.
Caution

Using the NOT operator can be dangerous, as it excludes records regardless of any relevant search terms they might contain. The above example also excludes records about animals if they mention humans.

The Boolean operators AND and OR share several properties with multiplication and addition (O’Regan (2012)).

1. Commutative property
   (a OR b) = (b OR a)
   (a AND b) = (b AND a)

2. Associative property
   (a OR b) OR c = a OR (b OR c)
   (a AND b) AND c = a AND (b AND c)

3. Distributive property
   a AND (b OR c) = (a AND b) OR (a AND c)
   a OR (b AND c) = (a OR b) AND (a OR c)
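
The commutative, associative and distributive properties can be checked with Python sets, where | stands for OR (union) and & stands for AND (intersection). The sample sets and the function name are hypothetical; the check is an illustration, not a proof.

```python
from itertools import product

def laws_hold(sample_sets):
    """Check the three properties for every combination of the given sets."""
    for a, b, c in product(sample_sets, repeat=3):
        # 1. commutative property
        if a | b != b | a or a & b != b & a:
            return False
        # 2. associative property
        if (a | b) | c != a | (b | c) or (a & b) & c != a & (b & c):
            return False
        # 3. distributive property
        if a & (b | c) != (a & b) | (a & c):
            return False
        if a | (b & c) != (a | b) & (a | c):
            return False
    return True

# Hypothetical result sets of record IDs.
samples = [set(), {1}, {2}, {1, 2}, {1, 2, 3}]
print(laws_hold(samples))
```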

4.11.2 frequency operators

Records in which a relevant expression occurs multiple times might be more relevant than records with fewer instances of the same expression. Therefore some search interfaces offer a so-called frequency operator, which retrieves records only if the search term occurs at least the specified number of times in the searched data field.

Example

In Ovid, the query "pharmacy".ab/freq=5 retrieves articles in which the term pharmacy occurs at least five times within the abstract.
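
The idea behind a frequency operator can be sketched as a simple count of whole-word matches in one field. The function names, the field name and the sample records are hypothetical; how a given search interface tokenizes and counts terms may differ.

```python
import re

def term_frequency(text, term):
    """Count whole-word, case-insensitive occurrences of `term` in `text`."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

def freq_filter(records, term, minimum, field="abstract"):
    """Keep only records in which `term` occurs at least `minimum` times in `field`."""
    return [r for r in records if term_frequency(r.get(field, ""), term) >= minimum]

# Hypothetical records with short abstracts.
records = [
    {"id": "A", "abstract": "Pharmacy staff and pharmacy students visited the pharmacy."},
    {"id": "B", "abstract": "One hospital pharmacy took part."},
]
hits = freq_filter(records, "pharmacy", 3)
print([r["id"] for r in hits])
```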

4.11.3 proximity operators

Many databases allow terms to be searched depending on the word distance between them. So-called proximity searching uses a proximity operator in which the allowed distance between the two expressions is defined by a number N, for example adjN or NEAR/N.

Examples
Examples of proximity searching in different resources
PubMed "heart transplant"[TI:~2]
Ovid (heart adj3 transplant).ti
Cochrane Library (heart NEAR/3 transplant):ti
Embase.com (heart NEAR/3 transplant):ti
EBSCOhost TI (heart N2 transplant)
Web of Science TI=(heart NEAR/2 transplant)
Scopus TITLE(heart W/2 transplant)
Please note
  • The number N sometimes defines the number of words in between the expressions, sometimes it marks the position of the second expression. That is why N is not always the same for every database or interface.

  • In PubMed, truncation cannot be combined with proximity searching. Separate expressions are therefore necessary to account for transplant as well as transplantation or transplanting.

  • Carefully consider the distance between the expressions. A distance of two words between the search terms (as shown above) is often reasonable, as it covers frequent expressions such as “transplantation of the heart”. However, sometimes it is necessary to include more adjectives in between.
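
The principle of a proximity operator can be sketched as follows. The function is hypothetical, and it interprets the distance as the maximum number of words allowed between the two terms, in either order; as noted above, the actual meaning of N differs between interfaces.

```python
import re

def within_distance(text, first, second, max_between):
    """Return True if `first` and `second` occur, in either order, with at most
    `max_between` other words between them."""
    words = re.findall(r"\w+", text.lower())
    pos_first = [i for i, w in enumerate(words) if w == first.lower()]
    pos_second = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) - 1 <= max_between
        for i in pos_first
        for j in pos_second
        if i != j
    )

title = "Transplantation of the heart"
print(within_distance(title, "heart", "transplantation", 2))
print(within_distance(title, "heart", "transplantation", 1))
```

With two words allowed in between, the title is found; with only one, it is missed, which illustrates why the distance should be chosen carefully.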

4.12 phrases

Most databases can be searched for verbatim expressions, called phrases or literal strings, by putting search terms in quotation marks " ".

Effects

The use of phrases usually switches off any automatically applied techniques, such as lemmatization, stemming or automated term mapping (ATM), for those expressions. In this way, the use of phrases usually reduces the number of search results, as it makes the search more precise and less sensitive.

Example:

Due to automated term mapping, the PubMed query heart arrest will be translated to
"heart arrest"[MeSH Terms] OR ("heart"[All Fields] AND "arrest"[All Fields]) OR "heart arrest"[All Fields]

However, the query "heart arrest" translates to "heart arrest"[All Fields], because phrases are not automatically mapped in PubMed.

Phrases and Truncation

In some cases the simultaneous use of phrases and truncation is not supported by the search interface.

Searching for "hearing aid*" in the Cochrane Library will prompt an error message suggesting the use of the NEXT operator to work around the problem. In other words, the search query should be hearing NEXT aid* instead.

Required quotation marks

Phrases are absolutely necessary when the search string contains a special character or anything the search interface would interpret as an operator or syntax. Examples for Ovid:

  • "go/no-go".ti,ab. – Without the quotation marks the forward slash / would be understood as a command to search the expression go as an index term.
  • "Sensitivity and Specificity"/ – Here, and is also an operator and must be escaped using the quotation marks.
  • "5".ip – This search retrieves all records with the number 5 in the Issue/Part field. Without the quotation marks, the query would search for the contents of line 5 in the field .ip. (That procedure is called postqualification of search sets.)
Straight quotes vs. curly quotes

There are several types of quotation marks. Search interfaces often only understand so-called straight quotes " " (the ones that have been used in typewriters). Modern word processors often change straight quotes automatically into the typographically correct curly quotes “ ” (the way they are printed in books), which can cause a syntax error, if they are copied into a search query. This is a known issue in Ovid.

Options to counter this problem:

  1. Deactivation of the responsible autocorrection feature of the word processor.
  2. Using a simpler plain text editor.
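
A further option, offered here only as a sketch and not as a documented workflow, is to convert curly quotes back to straight quotes before pasting a query. The function name and the selection of characters are assumptions.

```python
# Map common curly quotation marks to their straight counterparts.
QUOTE_MAP = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u201e": '"',  # low double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})

def straighten_quotes(query):
    """Replace curly quotation marks in a search query with straight ones."""
    return query.translate(QUOTE_MAP)

print(straighten_quotes("\u201cheart arrest\u201d.ti,ab."))
```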

4.13 search strategy

A search strategy is a coherent set of search queries designed to retrieve references for a particular topic. A search strategy is dependent on the syntax and index terms of the searched database. As a consequence, the search strategy needs to be translated for the use in other databases.

Translation of search strategies

There are two main challenges in translating search strategies from one database or search interface to another:

  1. Different syntax. The differences in operators, phrases, nesting or field codes can be managed either by experience, by consulting the resources’ own knowledge databases or by using tools such as the Polyglot Translator. In some cases, there is no objectively clear translation, for instance due to the lack of corresponding syntax. In these cases, the translation is a close approximation.

  2. Different index terms. In contrast to the syntax, which is fairly similar for most of the databases, the translation of index terms can be very difficult. Some databases do not index records, such as the Web of Science Core Collection. Others do assign index terms, but don’t possess a controlled vocabulary of their own, such as Scopus. Some thesauri are more detailed than others or the hierarchy is structured in a different way, so that a 1:1 translation could yield unexpected results.

Search strategies are supposed to be consistent with the predefined eligibility criteria of the research project (Gough, Oliver, and Thomas (2017)).

Depending on the purpose and detail of the search, the extent of a search strategy might range from a simple search string to numerous lines of search terms connected by Boolean operators.

Table 4.2 and Table 4.3 show examples for both types of search strategies.

Table 4.2: Single-line PubMed search strategy
("analgesics"[MH] OR "analgesic*"[TIAB] OR "analgesic*"[PA] OR "anodynes"[TIAB] 
OR "antinociceptive*"[TIAB]) AND "pain management"[MAJR]
Table 4.3: Multi-line search strategy for Ovid MEDLINE
1 exp analgesics/
2 (analge#ic? or anodyne? or antinociceptive?).ti,ab,kw.
3 *pain management/
4 (1 or 2) and 3
Testing your search strategy

You can use seed papers to test whether your search strategy picks them up. In the first step you search for the seed papers using their unique identifiers such as PMID or PMCID or DOI. Second, you connect the found seed papers with the rest of your search strategy using the Boolean operator AND as shown below in Table 4.4 – if the search results for lines 4 and 5 are the same, all the seed papers were retrieved by the search strategy.

Table 4.4: Testing the search strategy using seed papers.
1 hypertension/ or antihypertensive agents/
2 deprescriptions/
3 1 and 2
4 ("39209778" or "39094592" or "28437544" or "26591140" or "29678989").ui.
5 3 and 4
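
The comparison of lines 4 and 5 in Table 4.4 amounts to a set intersection. In the following sketch the identifiers of the seed papers are taken from line 4, whereas the function name and the result set of line 3 are hypothetical.

```python
def missing_seed_papers(strategy_hits, seed_ids):
    """Return the seed papers that the search strategy did not retrieve."""
    found = set(strategy_hits) & set(seed_ids)  # corresponds to line 5: 3 and 4
    return sorted(set(seed_ids) - found)

# Identifiers from line 4 of Table 4.4.
seed_ids = ["39209778", "39094592", "28437544", "26591140", "29678989"]

# Hypothetical result set of line 3 (the actual search strategy).
strategy_hits = {"39209778", "39094592", "28437544", "26591140", "11111111"}

print(missing_seed_papers(strategy_hits, seed_ids))
```

An empty list means that all seed papers were retrieved. Any identifier that is returned points to a record worth inspecting for terms that the strategy does not yet cover.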

4.14 subheading

Publications in bibliographic databases are often indexed using index terms. In cases where the index term does not sufficiently describe the content of the article, so-called subheadings or qualifiers can be applied to the index term to specify certain aspects.

Example

The PubMed search query Respiratory Tract Infections/drug therapy[MeSH] (or short Respiratory Tract Infections/dt[mh]) retrieves references focusing on the drug therapy of respiratory tract infections.

Typical subheadings are 'adverse effects', 'complications', 'diagnosis', 'drug therapy', 'economics', 'epidemiology', 'history', 'methods', 'pathology', 'pharmacology', 'psychology', 'standards', 'therapeutic use' and 'toxicology'.

Sometimes it may be required to retrieve all publications featuring a certain subheading regardless of the index terms, for instance when one is interested in reports of adverse effects (Golder et al. (2006)). To do so, the subheading can be searched alone as a so-called floating subheading, for example adverse effects[sh] in PubMed or ae.fs in Ovid MEDLINE.

For more information about subheadings in PubMed see https://pubmed.gov/help/#mesh-subheadings or https://nlm.nih.gov/mesh/qualifiers_scopenotes.html.

4.15 systematic literature searching

Conducting a systematic literature search (also called systematic search) means searching multiple databases for references to relevant articles and studies in a structured, planned and iterative manner. A systematic search is well-documented, transparent and (to a certain degree) reproducible.

Information specialists and librarians

Systematic searches are best conducted or at least accompanied by information specialists or librarians. For more on this topic, see Spencer and Eldredge (2018), Metzendorf (2016), Foster (2015), Koffel (2015), Rethlefsen et al. (2015), Schellinger et al. (2021), Meert, Torabi, and Costella (2016).

The purpose of a systematic literature search is to find relevant literature in the desired degree of completeness. For a systematic review the search should be very sensitive so as to find all the relevant evidence. In contrast, the literature search as part of a rapid review or narrative review may still be done systematically, but can be designed in a more precise manner, since in those cases it may be acceptable to miss some relevant articles. (See Section 2.7 about sensitivity and precision.)

There are various guidelines which can be followed to conduct systematic searches, such as the Cochrane Handbook (Higgins et al. (2022)) or the JBI Manual (Aromataris and Munn (2020)). Which guidance applies depends on the desired type of review (e.g. systematic, scoping, narrative, …; see Munn, Peters, et al. (2018), Munn, Stern, et al. (2018) or Pearson et al. (2015)).

Important steps of a systematic search
  1. definition of a research question
  2. selection of databases
  3. compilation of search terms
  4. setup of an initial search
  5. iterative refinement of the search strategy
  6. peer review of the search strategy
  7. translation to other databases
  8. final search and export of results
  9. documentation of the search

Various methods are applied in the design of search strategies. Keywords and index terms are connected using operators. The search queries are structured by nesting. The precision and sensitivity of the search are adjusted using truncation, by the choice of search fields or by the use of term explosion.

A systematic search is best peer-reviewed by colleagues (e.g. by information specialists, other researchers, or fellow students), following the PRESS guideline by McGowan et al. (2016).

Ideally, the systematic search is documented according to the PRISMA-S extension by Rethlefsen et al. (2021).

4.16 term explosion

Index terms in databases are usually hierarchically organized, starting with very broad terms in the top categories to very specific terms further down the branches of the hierarchy.

When searching using an index term, it is often possible to include all subordinate index terms in the search. This simultaneous search option is called term explosion.

Warning

Keep in mind that, depending on the search interface or database, term explosion might be active by default, as is the case in PubMed. This can be avoided by using the field codes [mh:noexp] or [majr:noexp] instead of [mh] or [majr].

Examples
  • The PubMed search query Psychotherapy[MH] automatically retrieves records featuring the MeSH term psychotherapy as well as aromatherapy, behaviour therapy, crisis intervention, hypnosis, or logotherapy, because in PubMed the term explosion is active by default. However, the query Psychotherapy[MH:NoExp] retrieves only records indexed with psychotherapy.

  • In Ovid MEDLINE, the search for psychotherapy/ looks only for references indexed with psychotherapy, whereas the query exp psychotherapy/ searches the exploded term, i.e. it uses the subordinate index terms such as aromatherapy, crisis intervention, etc. in the search, too.
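
Term explosion can be pictured as collecting a term together with everything below it in the hierarchy. The small tree below is a hypothetical excerpt for illustration and not an authoritative copy of MeSH; the function name is likewise an assumption.

```python
# Hypothetical excerpt of a thesaurus: each term maps to its direct narrower terms.
NARROWER = {
    "psychotherapy": ["aromatherapy", "behavior therapy", "crisis intervention"],
    "behavior therapy": ["cognitive behavioral therapy"],
}

def explode(term):
    """Return the term together with all of its subordinate terms."""
    terms = [term]
    for child in NARROWER.get(term, []):
        terms.extend(explode(child))
    return terms

print(explode("psychotherapy"))  # exploded search
print(["psychotherapy"])         # search without explosion
```

The exploded list corresponds to a search such as exp psychotherapy/ in Ovid, whereas the single-term list corresponds to psychotherapy/ or Psychotherapy[MH:NoExp].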

4.17 truncation

In general, to truncate means to reduce something down to its trunk, to shorten something, to cut something off.

In literature searching, truncation is a technique that allows several variations of a term to be searched at once by using a reduced (truncated) version of the search term. A term is truncated by replacing one or more characters with a so-called wildcard character, such as *, $, ?, #.

The most common form of truncation is the unlimited truncation (usually using the asterisk * or the dollar sign $) at the end of the search term. In some databases it is possible to limit the truncation to a fixed number of characters or to replace just one character. The kinds of truncation that are available depend on the syntax of the database or search interface.

Table 4.5: Truncation examples in Ovid
query covered search terms
discolo$ discolor, discolors, discolored, discolouring, discolorations, …
discolo$3 discolor, discolors, discolored
colo$r color, colour, colorimeter, colonizer, coloarticular, …
colo?r color, colour
wom#n woman, women, womon, womxn, womyn
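
The wildcards in Table 4.5 can be approximated with regular expressions. The translation below is a sketch that assumes the behaviour suggested by the table: $ stands for any number of characters, $N for at most N characters, ? for zero or one character and # for exactly one character. The function names are hypothetical.

```python
import re

def ovid_to_regex(term):
    """Translate the wildcards from Table 4.5 into a compiled regular expression."""
    pattern = ""
    i = 0
    while i < len(term):
        ch = term[i]
        if ch == "$":
            # Read an optional number that limits the truncation.
            j = i + 1
            while j < len(term) and term[j].isdigit():
                j += 1
            limit = term[i + 1:j]
            pattern += r"\w{0," + limit + "}" if limit else r"\w*"
            i = j
            continue
        if ch == "?":
            pattern += r"\w?"   # zero or one character
        elif ch == "#":
            pattern += r"\w"    # exactly one character
        else:
            pattern += re.escape(ch)
        i += 1
    return re.compile(pattern, re.IGNORECASE)

def matches(term, word):
    """Check whether `word` is covered by the truncated search term."""
    return ovid_to_regex(term).fullmatch(word) is not None

for term, word in [("discolo$", "discolorations"), ("discolo$3", "discolouring"),
                   ("colo?r", "colour"), ("wom#n", "women")]:
    print(term, word, matches(term, word))
```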

Usually only free text terms are truncated. While it might be possible to truncate index terms, it is not very practical; truncated index terms are rarely used meaningfully. One exception is validated search filters, such as the search term diagnostic*[MeSH:noexp] in the diagnosis filter by Kastner et al. (2009).