- Bibliometric analysis is a widely used method for explorative and analytical studies of large volumes of research data.
- The analysis helps reveal how a specific field of study has evolved over time and highlights emerging topics in the field.
- Bibliometrics is the application of quantitative analysis and statistics to publications such as journal articles and their accompanying citation counts. (https://en.wikipedia.org/wiki/Bibliometrix)
- Various methods are used to analyse publication data to evaluate growth, maturity, leading authors, conceptual structures, trends, topical evolution, etc.
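As a rough illustration of the indicators listed above, the following base R sketch computes annual production, the most frequent authors and the average citations per document. The records are invented for illustration; the column names follow the Web of Science-style field tags (AU, PY, TC), and the sketch does not depend on any bibliometric package.

```r
# Toy records using Web of Science-style field tags (AU = authors,
# PY = publication year, TC = times cited); all values are invented.
records <- data.frame(
  AU = c("SMITH J;LEE K", "LEE K", "GARCIA M;SMITH J;LEE K", "CHEN L"),
  PY = c(2018, 2019, 2019, 2020),
  TC = c(12, 4, 9, 1),
  stringsAsFactors = FALSE
)

# Growth: number of documents published per year
annual_production <- table(records$PY)

# Leading authors: split the ";"-separated author field and count appearances
authors <- unlist(strsplit(records$AU, ";", fixed = TRUE))
author_counts <- sort(table(authors), decreasing = TRUE)

# Impact: average citations per document
mean_citations <- mean(records$TC)

print(annual_production)
print(head(author_counts, 3))
print(mean_citations)
```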
R’s package ecosystem is one of its major advantages. Packages are available for most widely used statistical, data analysis and visualisation techniques, and new packages implementing methods published by academic researchers or industry practitioners are added almost daily.
R provides packages for various areas of interest (see https://cran.r-project.org/web/views/ for a list of task views grouping packages according to their functionality), including systematic literature review and the related field of meta-analysis.
Bibliometrix (Aria & Cuccurullo (2017)), Revtools (Westgate (2018)) and Litsearchr (E. Grames, Stillman, Tingley, & Elphick (2019), E. M. Grames, Stillman, Tingley, & Elphick (2019)) of the Metaverse (https://rmetaverse.github.io/) project, Adjutant (Crisan, Munzner, & Gardy (2018)), and Metagear (Lajeunesse (2016)) are a few of the packages providing such functionality.
Bibliometrix is by far the most popular of these, with several publications using the package; the package webpage (http://www.bibliometrix.org/Papers.html) provides a list of them (for example, see Lajeunesse (2016); Addor & Melsen (2019)). We will therefore use this package to demonstrate some of its functionality.
Bibliometrix (https://www.bibliometrix.org/) allows R users to import a bibliography database exported from SCOPUS or Web of Science and stored either as a BibTeX (.bib) or plain text (.txt) file.
The package has simple functions that allow for descriptive analyses, as shown in table-1 to table-3.
The analysis can also be easily visualised as shown in figure-17.1 to 17.5.
library(bibliometrix)  #load the package
library(pander)  #other required packages
library(knitr)
library(kableExtra)
library(ggplot2)
library(bibliometrixData)
# use scopuscollection data from the package
data("scientometrics")
# M=convert2df(file='scopus.bib',format='bibtex',dbsource = 'scopus')#convert
# external data to data frame

# Descriptive analysis
M <- scientometrics  #just to reuse the other code
res1 <- biblioAnalysis(M, sep = ";")
s1 <- summary(res1, k = 10, pause = FALSE, verbose = FALSE)
d1 <- s1$MainInformationDF  #main information
d2 <- s1$MostProdAuthors  #Most productive Authors
d3 <- s1$MostCitedPapers  #most cited papers
pander(d1, caption = "Summary Information")
| Description | Results |
|---|---|
| MAIN INFORMATION ABOUT DATA | |
| Sources (Journals, Books, etc) | 1 |
| Average years from publication | 14.1 |
| Average citations per documents | 14.81 |
| Average citations per year per doc | 0.8168 |
| article; proceedings paper | 19 |
| Keywords Plus (ID) | 392 |
| Author’s Keywords (DE) | 342 |
| Authors of single-authored documents | 32 |
| Authors of multi-authored documents | 237 |
| Documents per Author | 0.546 |
| Authors per Document | 1.83 |
| Co-Authors per Documents | 2.29 |
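The three authorship ratios at the end of the summary can be reproduced by hand. The sketch below uses invented author strings and assumes the usual definitions: documents divided by distinct authors, distinct authors divided by documents, and author appearances divided by documents. These definitions are consistent with the table above, where 0.546 is the reciprocal of 1.83, but they are stated here as an assumption rather than taken from the package source.

```r
# Invented author fields, one string per document, authors separated by ";"
au <- c("SMITH J;LEE K", "LEE K", "GARCIA M;SMITH J;LEE K", "CHEN L",
        "CHEN L;LEE K")

author_lists <- strsplit(au, ";", fixed = TRUE)
n_docs <- length(author_lists)
n_authors <- length(unique(unlist(author_lists)))  # distinct authors
n_appearances <- length(unlist(author_lists))      # author slots over all documents

documents_per_author <- n_docs / n_authors
authors_per_document <- n_authors / n_docs
coauthors_per_document <- n_appearances / n_docs

print(c(documents_per_author = documents_per_author,
        authors_per_document = authors_per_document,
        coauthors_per_document = coauthors_per_document))
```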
pander(d3, caption = "Most Cited Papers")
| Paper | Total citations | Citations per year | Normalized citations |
|---|---|---|---|
| BOYACK KW, 2005, SCIENTOMETRICS | 283 | 15.72 | 3.997 |
| SMALL H, 1985, SCIENTOMETRICS-a | 148 | 3.89 | 1.065 |
| VAN ECK NJ, 2010, SCIENTOMETRICS | 142 | 10.92 | 5.004 |
| SMALL H, 1985, SCIENTOMETRICS | 130 | 3.42 | 0.935 |
| SMALL H, 2006, SCIENTOMETRICS | 83 | 4.88 | 3.487 |
| GMUR M, 2003, SCIENTOMETRICS | 78 | 3.90 | 2.806 |
| ZITT M, 1994, SCIENTOMETRICS | 60 | 2.07 | 2.353 |
| GLANZEL W, 1996, SCIENTOMETRICS | 58 | 2.15 | 1.798 |
| DING Y, 2000, SCIENTOMETRICS | 46 | 2.00 | 2.667 |
| PONZI LJ, 2002, SCIENTOMETRICS | 44 | 2.10 | 1.234 |
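The two derived columns of the most-cited table can be approximated in a few lines of base R. The sketch below is only an illustration on invented data. It assumes that citations per year are total citations divided by the number of years since publication (counting the publication year), and that normalized citations are total citations divided by the mean citations of documents published in the same year. The `reference_year` value is hypothetical, and the package may use a slightly different convention.

```r
# Invented papers: publication year (PY) and total citations (TC)
papers <- data.frame(
  paper = c("A 2010", "B 2010", "C 2015", "D 2015", "E 2015"),
  PY = c(2010, 2010, 2015, 2015, 2015),
  TC = c(40, 10, 30, 6, 0),
  stringsAsFactors = FALSE
)
reference_year <- 2020  # hypothetical year in which the analysis is run

# Citations per year: total citations over years since publication
# (the "+ 1" counts the publication year itself; this is an assumption)
papers$tc_per_year <- papers$TC / (reference_year - papers$PY + 1)

# Normalized citations: citations relative to the mean of same-year documents
papers$ntc <- papers$TC / ave(papers$TC, papers$PY, FUN = mean)

# Rank papers by total citations, most cited first
print(papers[order(-papers$TC), ])
```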
p1 <- plot(res1, pause = FALSE)
- Analysis of the conceptual structure among the articles analysed.
- Bibliometrix can conduct a co-word analysis to map the conceptual structure of a framework using the word co-occurrences in a bibliographic database.
- The analysis in Figure-2 uses Correspondence Analysis and K-means clustering on the authors’ keywords. This analysis includes Natural Language Processing and is conducted without stemming.
library(gridExtra)
CS <- conceptualStructure(M, field = "DE", method = "CA", minDegree = 4, clust = "auto",
    stemming = FALSE, labelsize = 8, documents = 10, graph = FALSE)
# arrange two of the plots returned in CS side by side
grid.arrange(CS[], CS[], ncol = 2, nrow = 1)
Netmatrix2 <- biblioNetwork(M, analysis = "co-occurrences", network = "keywords",
    sep = ";")
# Plot the network
net <- networkPlot(Netmatrix2, normalize = "association", weighted = TRUE, n = 50,
    Title = "Keyword Co-occurrences", type = "fruchterman", size = TRUE, edgesize = 5,
    labelsize = 0.7)
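To make the idea of a keyword co-occurrence network more concrete, the following base R sketch builds a small co-occurrence matrix by hand and normalizes it with the association strength, i.e. the co-occurrence count divided by the product of the two keyword frequencies. The keyword strings are invented, and the sketch illustrates the general technique rather than the package's internal implementation.

```r
# Invented keyword fields, one string per document, keywords separated by ";"
keywords <- c(
  "CITATION ANALYSIS;MAPPING;CLUSTERING",
  "CITATION ANALYSIS;MAPPING",
  "MAPPING;CLUSTERING",
  "CITATION ANALYSIS;IMPACT FACTOR"
)

keyword_lists <- strsplit(keywords, ";", fixed = TRUE)
terms <- sort(unique(unlist(keyword_lists)))

# Document x keyword incidence matrix (1 if the document uses the keyword)
incidence <- t(vapply(keyword_lists,
                      function(k) as.integer(terms %in% k),
                      integer(length(terms))))
colnames(incidence) <- terms

# Keyword x keyword co-occurrence counts; the diagonal holds keyword frequencies
cooc <- crossprod(incidence)

# Association strength: co-occurrences divided by the product of the
# frequencies of the two keywords
freq <- diag(cooc)
assoc <- cooc / outer(freq, freq)
diag(assoc) <- 0  # self-links carry no information in the network

print(cooc)
print(round(assoc, 3))
```

The normalization damps links between very frequent keywords, which would otherwise dominate the plot simply because they appear in many documents.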
Co-word analysis identifies clusters of keywords. These clusters are treated as themes, whose density and centrality can be used to classify them and map them onto a two-dimensional diagram.
A thematic map is a very intuitive plot in which themes can be analysed according to the quadrant in which they are placed: (1) upper-right quadrant: motor themes; (2) lower-right quadrant: basic themes; (3) lower-left quadrant: emerging or disappearing themes; (4) upper-left quadrant: very specialized/niche themes.
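The quadrant logic can be sketched in base R as follows. The example assumes that centrality is plotted on the horizontal axis and density on the vertical axis, and that the quadrants are split at the mean of each measure. Both the cut-offs and the theme values are assumptions made for illustration, and `classify_theme` is a hypothetical helper, not a package function.

```r
# Invented themes with centrality (x-axis) and density (y-axis) scores
themes <- data.frame(
  theme = c("citation analysis", "science mapping", "altmetrics", "patent analysis"),
  centrality = c(0.9, 0.8, 0.2, 0.3),
  density = c(0.7, 0.2, 0.1, 0.8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: assign each theme to a quadrant given the two cut-offs
classify_theme <- function(centrality, density, c_cut, d_cut) {
  ifelse(centrality >= c_cut & density >= d_cut, "motor",
  ifelse(centrality >= c_cut & density < d_cut, "basic",
  ifelse(centrality < c_cut & density < d_cut, "emerging or disappearing",
         "niche")))
}

# Split the plane at the mean of each measure (an assumed convention)
themes$quadrant <- classify_theme(themes$centrality, themes$density,
                                  c_cut = mean(themes$centrality),
                                  d_cut = mean(themes$density))
print(themes)
```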
Map <- thematicMap(M, field = "ID", n = 1000, minfreq = 5, stemming = FALSE, size = 0.5,
    n.labels = 4, repel = TRUE)
plot(Map$map)
Finally, a Shiny-based GUI is also available.