4 Methods
4.1 citation searching
Relevant references can be retrieved by analyzing citation relationships between known relevant articles, called seed papers (also known as core papers) and cited or citing references. This technique is called citation searching, citation analysis or citation tracking (Hirt et al. (2024), Belter (2016), Hinde and Spackman (2015)).
The TARCiS Statement provides a guideline for citation searching. (Hirt et al. (2024))
Citation searching can be advisable at the very beginning of a systematic literature search in order to find additional core papers and to identify additional search terms. After finishing the final search it can be used to find additional relevant studies that were not retrieved by systematic searching using search strategies.
Some research questions are hard to search with a conventional Boolean approach, for instance due to very ambiguous search terms or very recent research topics for which no index terms and no established terminology exist yet. In these cases a citation search can be more useful than a Boolean search as it relies solely on the citation relationships between known and yet unknown publications.
There are different approaches to searching citations: First, in the backward citation searching method the lists of references cited in the seed papers are screened for relevant articles. Second, the analysis of publications citing the seed paper as a reference is called forward citation searching. See Figure 4.1.
Third, the identification of relevant literature by counting co-cited or co-citing references which connect the seed paper with other papers is called co-citation searching. In theory, the relevance of these other papers should increase with an increasing number of connecting publications.
4.1.1 co-cited citation searching
A seed paper (SP) is connected to other relevant publications (ORP) if both the SP and the ORP are cited by the same paper. In this case the SP and the ORP are co-cited by a number of publications. See Figure 4.2.
4.1.2 co-citing citation searching
A similar relationship exists between the SP and ORP if they both share a number of mutual references. In this case they are co-citing a set of common literature. See Figure 4.3.
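The four approaches described above can be illustrated with a minimal Python sketch. The citation data and the paper labels (`SP`, `A` to `E`, `R1` to `R3`) are hypothetical and only serve to show how the counts arise; real citation data would come from a citation index.

```python
from collections import Counter

# Hypothetical toy data: each paper maps to the set of papers it cites.
references = {
    "SP": {"R1", "R2", "R3"},   # the seed paper and its reference list
    "A": {"R1", "R2"},
    "B": {"R3"},
    "C": {"SP", "A"},
    "D": {"SP", "A", "B"},
    "E": {"SP"},
}

def backward(seed):
    """Backward citation searching: the references cited by the seed paper."""
    return set(references.get(seed, set()))

def forward(seed):
    """Forward citation searching: the papers citing the seed paper."""
    return {paper for paper, refs in references.items() if seed in refs}

def co_cited(seed):
    """Count how often other papers are cited together with the seed paper."""
    counts = Counter()
    for citing_paper in forward(seed):
        for other in references[citing_paper] - {seed}:
            counts[other] += 1
    return counts

def co_citing(seed):
    """Count the references other papers share with the seed paper."""
    seed_refs = backward(seed)
    counts = Counter()
    for paper, refs in references.items():
        shared = len(refs & seed_refs)
        if paper != seed and shared:
            counts[paper] = shared
    return counts

print(sorted(backward("SP")))
print(sorted(forward("SP")))
print(co_cited("SP").most_common())
print(co_citing("SP").most_common())
```

In this toy data set, paper `A` is connected to the seed paper by two co-citing publications and by two shared references, so it would be screened before paper `B`.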
4.2 critical appraisal
One of the main steps in evidence-based medicine is the critical appraisal of evidence. Critical appraisal checklists help to assess the methodological quality of a study and to determine the extent to which a study has excluded or minimized the possibility of bias in its design, conduct and analysis.
See also: Twells (2021), Buccheri and Sharifi (2017), Fineout-Overholt et al. (2010a), Fineout-Overholt et al. (2010b), Fineout-Overholt et al. (2010c)
4.3 deduplication
Whenever multiple databases are searched it is unavoidable that various references are retrieved more than once due to the overlapping of contents. The removal of duplicate records from a dataset is called deduplication.
Deduplication can be carried out manually using reference management software or in an automated way using review tools. The methods and tools for deduplication differ in their mode of operation and the quality of results (Janka and Metzendorf (2024), McKeown and Mir (2021), Bramer et al. (2016)).
An unwelcome practice, which makes deduplication (and concomitantly any scientific work) more difficult, is repetitive, duplicate or redundant publishing (Ding et al. (2020), Johnson (2006), Kassirer and Angell (1995)).
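To illustrate how automated deduplication can work in principle, here is a minimal Python sketch. It is not the method of any of the cited tools; the record structure (dictionaries with `title`, `year`, `doi` and `source`) and the matching rule (DOI if present, otherwise normalized title plus year) are assumptions made for this example.

```python
import re

def normalize_title(title):
    """Lower-case the title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def record_key(record):
    """Assumed matching rule: DOI if available, else normalized title and year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("title", normalize_title(record.get("title", "")), record.get("year"))

def deduplicate(records):
    """Keep the first record for every key and drop later duplicates."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

# Hypothetical export from several databases.
records = [
    {"title": "Deprescribing antihypertensives: a trial", "year": 2020,
     "doi": "10.1000/example.1", "source": "MEDLINE"},
    {"title": "Deprescribing Antihypertensives: A Trial.", "year": 2020,
     "doi": "10.1000/EXAMPLE.1", "source": "Embase"},
    {"title": "Another study", "year": 2019, "doi": None, "source": "MEDLINE"},
    {"title": "Another  study.", "year": 2019, "doi": "", "source": "CINAHL"},
]

unique = deduplicate(records)
print(len(records), "records,", len(unique), "after deduplication")
```

A rule this simple misses duplicates where only one copy carries a DOI, and it may merge distinct records with identical titles, which is one reason why the quality of results differs between methods and why manual checking remains useful.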
4.4 filters
Filters (also search filters, filter strategies or hedges) are search strategies designed to retrieve records for a specific concept of the research question.
There are filters for specific patient groups or diseases, outcomes, study types (for instance to retrieve only randomized controlled trials (RCTs)), or for other aspects of the research question, e.g. adverse effects, diagnostic accuracy or patient values (Waffenschmidt et al. (2020), Salvador-Oliván, Marco-Cuenca, and Arquero-Avilés (2021), Lee et al. (2012), Golder et al. (2006)).
Validated filters are developed, tested and optimized for sensitivity and precision by experts. (Haynes and Wilczynski (2004), Glanville et al. (2008))
Filter strategies can be implemented in a search strategy just like any concept of the research question. Table 4.1 illustrates a search strategy with a short qualitative research filter in lines 8 to 11, which is added to the overall search in line 12.
1 hypertension/
2 (hypertension or high blood pressure).ti,ab.
3 1 or 2
4 exp patient attitude/
5 *patient satisfaction/
6 (choice$ or empower$).ti.
7 or/4-6
8 interview$.mp.
9 experience$.mp.
10 qualitative.tw.
11 or/8-10
12 3 and 7 and 11
4.5 field code
The data fields of databases possess short designations called field codes, field tags or field labels, which allow the fields to be searched separately as part of a search query or search strategy.
Depending on the syntax of the database or search interface different field codes are available for searching. If no field code is used in a search query, the search term is usually searched in all fields or a preset variety of fields.
In PubMed the query `hypnosis[TIAB]` will basically search for records with “hypnosis” in the title or abstract. The same search would be written as `hypnosis.ti,ab.` in Ovid.
See also PubMed Search field tags and Ovid Medline Fields.
4.6 focus topic
Focus topics (also called major topics) are weighted index terms. If a particular topic is at the heart of a publication, index terms for that topic will be assigned as a so-called focus topic. This is often displayed in the databases by writing an asterisk before or after the index term.
The asterisk `*` also appears as a wildcard character for truncation. The two look the same but have different meanings and uses, so take care not to confuse them.
By labeling an index term this way, its significance for the publication is visualized. Moreover, searching for the focus topic instead of the index term allows for a search to be focused only on the most important papers for a particular topic.
A systematic review on hypertension will most likely feature the focus topic `*hypertension`, whereas an article about vascular diseases might be indexed with `hypertension` as a normal subject heading. Searching for `hypertension[majr]` in PubMed or `*hypertension/` in Ovid would yield only the first of those two articles. A search for `hypertension[mh]` or `hypertension/` would find both of them.
The article Apnoeic oxygenation during paediatric tracheal intubation by Fuchs et al. (2024) is indexed with `Intubation, Intratracheal* / adverse effects` and `Intubation, Intratracheal* / methods` as major topics, whereas `Hypoxia / etiology` was added as a regular MeSH term as it is not the main focus of the article.
4.7 grey literature
Grey literature (also gray literature) is so-called non-conventional or informal literature which “cannot readily be obtained through normal bookselling channels”, for example conference proceedings, reports, specifications, supplementary publications, technical notes, theses and translations (Wood (1982), Auger (2017)).
According to Paez (2017) the search for grey literature can be an important resource for systematic literature searching.
URL | Description |
---|---|
https://www.proquest.com/ | Dissertations and theses (global) |
http://search.ndltd.org/ | Dissertations and theses (global) |
https://search.worldcat.org/ | Dissertations and theses (global) |
https://www.dart-europe.org/ | Dissertations and theses (EU) |
https://www.base-search.net/ | Bielefeld Academic Search Engine |
https://www.science.gov/ | Research results from U.S. federal agencies |
https://ntrl.ntis.gov/NTRL/ | National Technical Reports Library (US) |
https://mednar.com/mednar/desktop/en/search.html | free deep web search engine for medical topics |
https://wonder.cdc.gov/ | Wide-ranging Online Data for Epidemiologic Research |
https://www.ahrq.gov/ | Agency for Healthcare Research and Quality |
https://v2.sherpa.ac.uk/opendoar/ | Directory Of Academic Repositories |
https://www.greynet.org/ | Grey Literature Network Service |
doi:10.17026/dans-xtf-47w5 | Archive of OpenGrey.eu |
4.8 index terms
Index terms, also called subject headings (or sometimes descriptor, DE: Schlagwörter) are expressions defined for the purpose of indexing. They represent content-related concepts and are organized as a controlled vocabulary. By adding index terms to a record it becomes retrievable based on its contents.
Using index terms in a search is an essential part of a systematic literature search. The index term search can be focused by employing subheadings or searching the terms as focus topics.
Index terms should not be confused with free text terms (DE: Stichwörter) or author keywords.
The MeSH term for the concept of heart attack or cardiovascular stroke is `Myocardial infarction`.
The Emtree term (index term in Embase) for the same concept is `heart infarction`.
4.9 limits
There are various means to restrict the results of a search. One of them is the use of so-called limits: filter options provided by the search interface which allow the search results to be limited to characteristics such as publication type, language or year of publication.
Database limits are not to be mistaken for validated filter strategies. Limits are usually based on certain data fields, whereas validated search filters are more complex search strategies. See also Section 4.4.
Database limits based on index terms may lead to the unintended exclusion of non-indexed records.
The usage of database limits is not always evident from the search history. If it is not possible to display applied limits in the search history, their use should be reported in the documentation.
4.10 nesting
Nesting refers to the use of parentheses `( )` to group search terms within a query. The purpose of nesting a search query is to define the order in which the search terms and operators are processed.
Without nesting the order in which the elements of a search query are processed depends on the rules of the respective search interface. This means that the same query might produce very different results in the individual databases.
- Within PubMed all searches are processed in a left-to-right sequence.
Thus the following PubMed queries yield completely different results:
exercise[MH] AND infection[MH] OR heart[MH]
exercise[MH] AND heart[MH] OR infection[MH]
heart[MH] OR infection[MH] AND exercise[MH]
On the other hand, the following nested queries all retrieve the same results:
exercise[MH] AND (infection[MH] OR heart[MH])
exercise[MH] AND (heart[MH] OR infection[MH])
(heart[MH] OR infection[MH]) AND exercise[MH]
- Web of Science executes the search in an order of precedence of the operators.
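The effect of the processing order can be illustrated with a small Python sketch that treats search results as sets of record IDs. The record IDs in `index` are hypothetical, and the second function only assumes an interface in which `AND` takes precedence over `OR`; it is not a reimplementation of any particular database.

```python
# Hypothetical record IDs indexed with each MeSH term.
index = {
    "exercise": {1, 2, 3},
    "infection": {2, 4},
    "heart": {3, 5},
}

def evaluate_left_to_right(tokens):
    """Process the operators strictly in the order in which they appear."""
    result = set(index[tokens[0]])
    for operator, term in zip(tokens[1::2], tokens[2::2]):
        if operator == "AND":
            result &= index[term]
        elif operator == "OR":
            result |= index[term]
        else:
            raise ValueError(f"unsupported operator: {operator}")
    return result

def evaluate_and_before_or(tokens):
    """Assumed precedence rule: resolve every AND group first, then the ORs."""
    groups = [[]]
    for token in tokens:
        if token == "OR":
            groups.append([])
        elif token != "AND":
            groups[-1].append(token)
    result = set()
    for group in groups:
        result |= set.intersection(*(index[term] for term in group))
    return result

queries = [
    "exercise AND infection OR heart",
    "exercise AND heart OR infection",
    "heart OR infection AND exercise",
]
for query in queries:
    tokens = query.split()
    print(query, evaluate_left_to_right(tokens), evaluate_and_before_or(tokens))

# The nested form: exercise AND (infection OR heart)
nested = index["exercise"] & (index["infection"] | index["heart"])
print(nested)
```

With this toy data, the three unnested queries return three different sets when read from left to right, and the third query changes its result again under the assumed precedence rule. Only the nested form gives the same result regardless of how the interface processes operators.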
4.11 operators
4.11.1 Boolean operators
Databases usually allow a search to be structured using the three basic operations of Boolean algebra, which are expressed with the Boolean operators `AND` (conjunction), `OR` (disjunction) and `NOT` (negation). The `AND` operator creates an intersection of sets, `OR` creates a union of sets and `NOT` excludes sets (see Figure 4.4).
- The query `"heart attack" AND diabetes AND obesity` retrieves only records featuring all three terms.
- The query `"cardiac arrest" OR asystole` retrieves records containing at least one of the two terms.
- The query `animals NOT humans` removes all records mentioning humans from the set `animals`.
Using the `NOT` operator can be dangerous, as it excludes records regardless of any relevant search terms they might contain. The example above also excludes records about animals if they mention humans.
The Boolean operators `AND` and `OR` have properties similar to those of multiplication and addition (O’Regan (2012)).
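The set interpretation of the Boolean operators can be sketched in Python, where `&`, `|` and `-` stand in for `AND`, `OR` and `NOT`. The four records and the naive substring matching in `search` are hypothetical simplifications chosen for this illustration.

```python
# Hypothetical mini-database: record ID -> text of the record.
records = {
    1: "heart attack in patients with diabetes and obesity",
    2: "cardiac arrest in animals",
    3: "asystole in humans and animals",
    4: "diabetes in humans",
}

def search(term):
    """Naive search: IDs of all records whose text contains the term."""
    return {record_id for record_id, text in records.items() if term in text}

# AND creates an intersection of sets.
all_three = search("heart attack") & search("diabetes") & search("obesity")

# OR creates a union of sets.
either = search("cardiac arrest") | search("asystole")

# NOT excludes a set: record 3 is lost although it is about animals.
animals_only = search("animals") - search("humans")

# Like multiplication over addition, AND distributes over OR.
a, b, c = search("animals"), search("humans"), search("diabetes")
distributive = (a & (b | c)) == ((a & b) | (a & c))

print(all_three, either, animals_only, distributive)
```

Record 3 shows why `NOT` needs care: it mentions both animals and humans and is therefore removed from the result.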
4.11.2 frequency operators
Records in which a relevant expression occurs multiple times might be more relevant than records with fewer instances of the same expression. Therefore some search interfaces allow the use of a so-called frequency operator, which retrieves records only if the search term occurs at least the specified number of times in the searched data field.
Example from Ovid: The query `"pharmacy".ab/freq=5` will retrieve articles in which the term `pharmacy` occurs at least five times within the abstract.
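The idea behind a frequency operator can be sketched in a few lines of Python. The function names and the two abstracts are hypothetical, and the counting rule (exact, case-insensitive word matches) is an assumption; the actual interfaces may count differently.

```python
import re

def count_term(text, term):
    """Count exact, case-insensitive occurrences of a single word."""
    tokens = re.findall(r"\w+", text.lower())
    return tokens.count(term.lower())

def frequency_search(records, field, term, minimum):
    """Keep the records in which the term occurs at least `minimum` times."""
    return [record for record in records
            if count_term(record.get(field, ""), term) >= minimum]

# Hypothetical records with an abstract field.
records = [
    {"id": 1, "ab": "Pharmacy services vary. The pharmacy team, pharmacy "
                    "students and pharmacy staff rated the pharmacy."},
    {"id": 2, "ab": "A hospital pharmacy audited one community pharmacy."},
]

hits = frequency_search(records, "ab", "pharmacy", 5)
print([record["id"] for record in hits])
```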
4.11.3 proximity operators
Many databases allow searching for terms depending on the word distance between them. So-called proximity searching uses a proximity operator in which the allowed distance between the two expressions is defined by a number N, for example adjN or NEAR/N.
Interface | Example query |
---|---|
PubMed | "heart transplant"[TI:~2] |
Ovid | (heart adj3 transplant).ti |
Cochrane Library | (heart NEAR/3 transplant):ti |
Embase.com | (heart NEAR/3 transplant):ti |
EBSCOhost | TI (heart N2 transplant) |
Web of Science | TI=(heart NEAR/2 transplant) |
Scopus | TITLE(heart W/2 transplant) |
The number N sometimes defines the number of words allowed in between the expressions; sometimes it marks the position of the second expression. That is why N is not always the same for every database or interface.
In the case of PubMed, truncation is not possible at the same time as proximity searching. In this case, various expressions are necessary to account for transplant as well as transplantation or transplanting.
Carefully consider the distance between the expressions. A distance of two words between the search terms (as shown above) is often reasonable, as it covers frequent expressions such as “transplantation of the heart”. However, sometimes it is necessary to include more adjectives in between.
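To make the distance rule concrete, the following Python sketch checks whether two words occur with at most a given number of words in between, in either order. It assumes the "words in between" reading of N and exact word matches; it does not reproduce the behaviour of any specific interface.

```python
import re

def tokenize(text):
    """Split a text into lower-case word tokens."""
    return re.findall(r"\w+", text.lower())

def within_distance(text, term_a, term_b, max_between):
    """True if term_a and term_b occur with at most max_between words between them."""
    tokens = tokenize(text)
    positions_a = [i for i, token in enumerate(tokens) if token == term_a.lower()]
    positions_b = [i for i, token in enumerate(tokens) if token == term_b.lower()]
    return any(abs(i - j) - 1 <= max_between
               for i in positions_a for j in positions_b if i != j)

title = "Transplantation of the heart"
print(within_distance(title, "heart", "transplantation", 2))
print(within_distance(title, "heart", "transplantation", 1))
```

The title contains two words between the search terms, so a distance of two finds it while a distance of one does not. Because the sketch matches exact words only, each variant such as transplant, transplantation or transplanting would have to be checked separately, much like in the PubMed case described above.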
4.12 phrases
Most databases can be searched for verbatim expressions, called phrases or literal strings, by putting search terms in quotation marks `" "`.
The use of phrases usually terminates any automatically applied techniques, such as lemmatization, stemming or automated term mapping (ATM) for those expressions. In this way, the use of phrases usually reduces the amount of search results as it makes the search more precise and less sensitive.
Example:
Due to automated term mapping the PubMed query `heart arrest` will be translated to
"heart arrest"[MeSH Terms] OR ("heart"[All Fields] AND "arrest"[All Fields]) OR "heart arrest"[All Fields]
However, the query `"heart arrest"` translates to `"heart arrest"[All Fields]`, because phrases are not automatically mapped in PubMed.
In some cases the simultaneous use of phrases and truncation is not supported by the search interface.
Searching for `"hearing aid*"` in the Cochrane Library will prompt an error message suggesting the use of the `NEXT` operator to work around the problem. In other words, the search query should be `hearing NEXT aid*` instead.
Phrases are absolutely necessary when the search string contains a special character or anything the search interface would interpret as an operator or syntax. Examples for Ovid:
- `"go/no-go".ti,ab.` – Without the quotation marks the forward slash `/` would be understood as a command to search the expression `go` as an index term.
- `"Sensitivity and Specificity"/` – `and` is also an operator and must be escaped using the quotation marks.
- `"5".ip` – This search retrieves all records with the number 5 in the Issue/Part field. Without the quotation marks, the query would search for the contents of line 5 in the field `.ip`. (That procedure is called postqualification of search sets.)
There are several types of quotation marks. Search interfaces often only understand so-called straight quotes `" "` (the ones that have been used in typewriters). Modern word processors often change straight quotes automatically into the typographically correct curly quotes `“ ”` (the way they are printed in books), which can cause a syntax error if they are copied into a search query. This is a known issue in Ovid.
Options to counter this problem:
- Deactivation of the responsible autocorrection feature of the word processor.
- Using a simpler plain text editor.
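As a further possibility, a search strategy can be cleaned programmatically before it is pasted into the search interface. The following Python sketch replaces common curly quotes with straight ones; the function name `straighten_quotes` and the selection of characters are assumptions made for this example.

```python
# Map common typographic quotation marks to their straight equivalents.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u201e": '"',  # low double quotation mark (used e.g. in German)
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})

def straighten_quotes(query):
    """Return the query with curly quotes replaced by straight quotes."""
    return query.translate(CURLY_TO_STRAIGHT)

print(straighten_quotes("\u201cgo/no-go\u201d.ti,ab."))
```

The output should still be checked by eye, since a blanket replacement also changes apostrophes inside search terms.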
4.13 search strategy
A search strategy is a coherent set of search queries designed to retrieve references for a particular topic. A search strategy is dependent on the syntax and index terms of the searched database. As a consequence, the search strategy needs to be translated for use in other databases.
There are two main challenges in translating search strategies from one database or search interface to another:
- Different syntax. The differences in operators, phrases, nesting or field codes can be managed either by experience, by consulting the resources’ own knowledge databases or by using tools such as the Polyglot Translator. In some cases, there is no objectively clear translation, for instance due to the lack of corresponding syntax. In these cases, the translation is a close approximation.
- Different index terms. In contrast to the syntax, which is fairly similar for most of the databases, the translation of index terms can be very difficult. Some databases do not index records, such as the Web of Science Core Collection. Others do assign index terms, but don’t possess a controlled vocabulary of their own, such as Scopus. Some thesauri are more detailed than others or the hierarchy is structured in a different way, so that a 1:1 translation could yield unexpected results.
Search strategies are supposed to be consistent with the predefined eligibility criteria of the research project (Gough, Oliver, and Thomas (2017)).
Depending on the purpose and detail of the search, the extent of a search strategy might range from a simple search string to numerous lines of search terms connected by Boolean operators.
Table 4.2 and Table 4.3 show examples for both types of search strategies.
("analgesics"[MH] OR "analgesic*"[TIAB] OR "analgesic*"[PA] OR "anodynes"[TIAB] OR "antinociceptive*"[TIAB]) AND "pain management"[MAJR]
1 exp analgesics/
2 (analge#ic? or anodyne? or antinociceptive?).ti,ab,kw.
3 *pain management/
4 (1 or 2) and 3
You can use seed papers to test whether your search strategy picks them up. In the first step you search for the seed papers using their unique identifiers such as PMID, PMCID or DOI. Second, you connect the found seed papers with the rest of your search strategy using the Boolean operator `AND` as shown below in Table 4.4. If the search results for lines 4 and 5 are the same, all the seed papers were retrieved by the search strategy.
1 hypertension/ or antihypertensive agents/
2 deprescriptions/
3 1 and 2
4 ("39209778" or "39094592" or "28437544" or "26591140" or "29678989").ui.
5 3 and 4
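The same check can be done outside the database once the results have been exported. The Python sketch below compares the seed PMIDs from Table 4.4 with a set of retrieved PMIDs; the content of `retrieved_pmids` is invented for illustration, and the function name `check_seed_papers` is hypothetical.

```python
# Seed papers identified by PMID (taken from line 4 of Table 4.4).
seed_pmids = {"39209778", "39094592", "28437544", "26591140", "29678989"}

# Hypothetical PMIDs exported from the result of line 3.
retrieved_pmids = {"39209778", "39094592", "28437544", "26591140", "11111111"}

def check_seed_papers(seeds, retrieved):
    """Return the seed papers found by the search and those it missed."""
    found = seeds & retrieved      # corresponds to line 5: 3 and 4
    missing = seeds - retrieved    # seed papers the strategy did not retrieve
    return found, missing

found, missing = check_seed_papers(seed_pmids, retrieved_pmids)
print(f"{len(found)} of {len(seed_pmids)} seed papers retrieved")
print("missing:", sorted(missing))
```

Looking at the index terms and wording of a missed seed paper often shows which search terms are still lacking in the strategy.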
4.14 subheading
Publications in bibliographic databases are often indexed using index terms. In cases where the index term does not sufficiently describe the content of the article, so-called subheadings or qualifiers can be applied to the index term to specify certain aspects.
The PubMed search query `Respiratory Tract Infections/drug therapy[MeSH]` (or short `Respiratory Tract Infections/dt[mh]`) retrieves references focusing on the drug therapy of respiratory tract infections.
Typical subheadings are 'adverse effects', 'complications', 'diagnosis', 'drug therapy', 'economics', 'epidemiology', 'history', 'methods', 'pathology', 'pharmacology', 'psychology', 'standards', 'therapeutic use', 'toxicology'.
Sometimes it may be required to retrieve all publications featuring a certain subheading regardless of the index terms, for instance when one is interested in reports of adverse effects (Golder et al. (2006)). In order to do that the subheading can be searched alone as a so-called floating subheading, for example `adverse effects[sh]` in PubMed or `ae.fs` in Ovid MEDLINE.
For more information about subheadings in PubMed see https://pubmed.gov/help/#mesh-subheadings or https://nlm.nih.gov/mesh/qualifiers_scopenotes.html.
4.15 systematic literature searching
Conducting a systematic literature search (also called systematic search) means to search multiple databases for references to relevant articles and studies in a structured, planned and iterative manner. A systematic search is well-documented, transparent and (to a certain degree) reproducible.
The purpose of a systematic literature search is to find relevant literature in the desired degree of completeness. For a systematic review the search should be very sensitive so as to find all the relevant evidence. In contrast, the literature search as part of a rapid review or narrative review may still be done systematically, but can be designed in a more precise manner, since in those cases it may be acceptable to miss some relevant articles. (See Section 2.7 about sensitivity and precision).
There are various guidelines which can be followed to conduct systematic searches, such as the Cochrane Handbook (Higgins et al. (2022)) or the JBI Manual (Aromataris and Munn (2020)). Depending on the desired type of review (e.g. systematic, scoping, narrative, …, see Munn, Peters, et al. (2018), Munn, Stern, et al. (2018) or Pearson et al. (2015)), a systematic search typically comprises some or all of the following steps:
- definition of a research question
- selection of databases
- compilation of search terms
- setup of an initial search
- iterative refinement of the search strategy
- peer review of the search strategy
- translation to other databases
- final search and export of results
- documentation of the search
Various methods are applied in the design of search strategies. Keywords and index terms are connected using operators. The search queries are structured by nesting. The precision and sensitivity of the search is adjusted using truncation, by the choice of search fields or the use of term explosion.
A systematic search is best peer-reviewed by colleagues (e.g. by information specialists, other researchers, or fellow students), following the PRESS guideline by McGowan et al. (2016).
Ideally, the systematic search is documented according to the PRISMA-S extension by Rethlefsen et al. (2021).
4.16 term explosion
Index terms in databases are usually hierarchically organized, starting with very broad terms in the top categories to very specific terms further down the branches of the hierarchy.
When searching using an index term, it is often possible to include all subordinate index terms in the search. This simultaneous search option is called term explosion.
Keep in mind that depending on the search interface or database, term explosion might be active by default, as is the case in PubMed. This can be avoided by using the field codes `[mh:noexp]` or `[majr:noexp]` instead of `[mh]` or `[majr]`.
The PubMed search query `Psychotherapy[MH]` automatically retrieves records featuring the MeSH term `psychotherapy` as well as `aromatherapy`, `behaviour therapy`, `crisis intervention`, `hypnosis`, or `logotherapy`, because in PubMed the term explosion is active by default. However, the query `Psychotherapy[MH:NoExp]` retrieves only records indexed with `psychotherapy`.
In Ovid MEDLINE, the search for `psychotherapy/` looks only for references indexed with `psychotherapy`, whereas the query `exp psychotherapy/` searches the exploded term, i.e. it uses the subordinate index terms such as `aromatherapy`, `crisis intervention`, etc. in the search, too.
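Term explosion amounts to walking down the hierarchy and collecting every subordinate term. The Python sketch below shows this with a deliberately tiny hierarchy that contains only the terms named in the example above; the real MeSH tree is far larger and deeper, and the `hierarchy` dictionary and the function `explode` are assumptions made for illustration.

```python
# Tiny, incomplete excerpt of a term hierarchy: parent -> direct children.
hierarchy = {
    "psychotherapy": [
        "aromatherapy",
        "behaviour therapy",
        "crisis intervention",
        "hypnosis",
        "logotherapy",
    ],
}

def explode(term):
    """Return the term together with all its subordinate terms."""
    terms = [term]
    for child in hierarchy.get(term, []):
        terms.extend(explode(child))  # recurse, so deeper levels are included
    return terms

print(explode("psychotherapy"))  # exploded search
print(["psychotherapy"])         # search without explosion
```

The exploded search is then equivalent to combining all collected terms with `OR`.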
4.17 truncation
In general, to truncate means to reduce something down to its trunk, to shorten something, to cut something off.
In literature searching truncation is a technique that allows several variations of a term to be searched at once by searching with a reduced (truncated) version of the search term. A term is truncated by replacing one or more characters with a so-called wildcard character, such as `*`, `$`, `?`, `#`.
The most common form of truncation is the unlimited truncation (usually using the asterisk `*` or the dollar sign `$`) at the end of the search term. In some databases it is possible to limit the truncation to a fixed number of characters or to replace just one character. The kinds of truncation that are available depend on the syntax of the database or search interface.
query | covered search terms |
---|---|
discolo$ | discolor, discolors, discolored, discolouring, discolorations, … |
discolo$3 | discolor, discolors, discolored |
colo$r | color, colour, colorimeter, colonizer, coloarticular, … |
colo?r | color, colour |
wom#n | woman, women, womon, womxn, womyn |
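The wildcard behaviour shown in the table can be imitated with regular expressions. The Python sketch below assumes the semantics suggested by the table (`$` for any number of characters, `$3` for up to three characters, `?` for zero or one character, `#` for exactly one character); the vocabulary is a hypothetical word list, and real interfaces may apply further rules.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a truncated search term into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        char = pattern[i]
        if char == "$":
            # Collect an optional number directly after the dollar sign.
            j = i + 1
            while j < len(pattern) and pattern[j].isdigit():
                j += 1
            limit = pattern[i + 1:j]
            parts.append(rf"\w{{0,{limit}}}" if limit else r"\w*")
            i = j
            continue
        if char == "?":
            parts.append(r"\w?")   # zero or one character
        elif char == "#":
            parts.append(r"\w")    # exactly one character
        else:
            parts.append(re.escape(char))
        i += 1
    return re.compile("".join(parts), re.IGNORECASE)

def matching_terms(pattern, vocabulary):
    """Return all words of the vocabulary covered by the truncated term."""
    regex = wildcard_to_regex(pattern)
    return [word for word in vocabulary if regex.fullmatch(word)]

# Hypothetical vocabulary to test the patterns against.
vocabulary = ["discolor", "discolors", "discolored", "discolouring",
              "color", "colour", "colonizer", "woman", "women"]

for pattern in ["discolo$", "discolo$3", "colo$r", "colo?r", "wom#n"]:
    print(pattern, matching_terms(pattern, vocabulary))
```

The run shows why unlimited truncation inside a word needs care: `colo$r` also covers `colonizer`, whereas `colo?r` is restricted to `color` and `colour`.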
Usually only free text terms are truncated. While it might be possible to truncate index terms, it is not very practical; truncated index terms are rarely used meaningfully, for instance in validated search filters, such as the search term `diagnostic*[MeSH:noexp]` in the diagnosis filter by Kastner et al. (2009).