DataScience+ An online community for showcasing R & Python tutorials. It operates as a networking platform for data scientists to promote their talent and get hired. Our mission is to empower data scientists by bridging the gap between talent and opportunity.

Creating Reporting Template with Glue in R

Report generation is an important part of any organization's Business Intelligence and Analytics division. The ability to create automated reports from given data is something any innovative team would strive for. It is also an area where SAS is considered more mature than R, not because R lacks those features, but primarily because R practitioners are less familiar with them. I had the same feeling today when I stumbled upon the glue package in R, which is a very good and competitive alternative to reporting-template packages like whisker and brew.

The package can be installed directly from CRAN.

install.packages('glue')

Let us try to put together a very minimal reporting template to output basic information about the given Dataset.

library(glue)
df <- mtcars
msg <- 'Dataframe Info: \n\n This dataset has {nrow(df)} rows and {ncol(df)} columns. \n There {ifelse(sum(is.na(df)) == 1,"is","are")} {sum(is.na(df))} Missing Values.'
glue(msg)
Dataframe Info: 
This dataset has 32 rows and 11 columns. 
There are 0 Missing Values.

As the code above shows, glue() is the primary function: it takes a string containing R expressions enclosed in curly braces {}, evaluates each expression, and inserts the result into the string. Creating that template string is what we did with msg. The exercise would not make much sense for a single dataset; it pays off when the same code is reused on different datasets with no code change. So let us run it on a different dataset, R's built-in iris dataset. Since we are also reporting the count of missing values, let's manually assign NA to two cells before running the code.

df <- iris
df[2,3] <- NA
df[4,2] <- NA
msg <- 'Dataframe Info: \n\n This dataset has {nrow(df)} rows and {ncol(df)} columns. \n There {ifelse(sum(is.na(df)) == 1,"is","are")} {sum(is.na(df))} Missing Values.'
glue(msg)
Dataframe Info: 
This dataset has 150 rows and 5 columns.
There are 2 Missing Values.
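
Since the point of the template is reuse across datasets, one way to package it is a small function. The following is only a sketch: the names describe_df, n_missing, verb and noun are my own and do not come from glue. It relies on glue() evaluating the braces in the calling environment, which here is the body of the function, and it picks singular or plural wording from the count.

```r
library(glue)

# Hypothetical helper (not part of glue): fills the template for any data frame.
describe_df <- function(df) {
  n_missing <- sum(is.na(df))
  # Choose singular or plural wording based on the count.
  verb <- if (n_missing == 1) "is" else "are"
  noun <- if (n_missing == 1) "Value" else "Values"
  # glue() looks up df, n_missing, verb and noun in this function's environment.
  glue(
    "Dataframe Info: \n\n",
    "This dataset has {nrow(df)} rows and {ncol(df)} columns. \n",
    "There {verb} {n_missing} Missing {noun}."
  )
}

describe_df(mtcars)
describe_df(iris)
```

Calling the function with a different data frame is then the only change needed to produce a new report.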

That looks fine. But what if we want to report the contents of the data frame itself? That's where coupling glue's glue_data() function with magrittr's %>% operator helps.

library(magrittr)
head(mtcars) %>% glue_data("* {rownames(.)} has {cyl} cylinders and {hp} hp")
* Mazda RX4 has 6 cylinders and 110 hp
* Mazda RX4 Wag has 6 cylinders and 110 hp
* Datsun 710 has 4 cylinders and 93 hp
* Hornet 4 Drive has 6 cylinders and 110 hp
* Hornet Sportabout has 8 cylinders and 175 hp
* Valiant has 6 cylinders and 105 hp
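
glue_data() returns one string per row, so a report usually needs those lines joined into a single block of text. A minimal sketch of that step, assuming the row names are first copied into a column (the column name model and the variables cars, lines and report are my own choices), uses glue_collapse() from the same package:

```r
library(glue)

cars <- head(mtcars, 3)
# Row names are not a regular column, so copy them into one for the template.
cars$model <- rownames(cars)

# One string per row; the data frame is passed as the first argument, no pipe needed.
lines <- glue_data(cars, "* {model} has {cyl} cylinders and {hp} hp")

# Join the per-row strings into a single newline-separated string.
report <- glue_collapse(lines, sep = "\n")
report
```

The collapsed string can then be written to a file or embedded in a larger template.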

This was just an introduction to glue and its possibilities. It could help automate many reports and serve as a starting point for exception-based reporting. The code used in the article can be found on my GitHub.
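
As a rough illustration of the exception-based reporting idea, the sketch below reports only the columns that actually contain missing values and stays quiet about the rest. The function name report_missing and the message wording are assumptions of mine, not something glue provides.

```r
library(glue)

# Hypothetical helper: emits one line per column that contains NAs.
report_missing <- function(df) {
  na_counts <- colSums(is.na(df))      # named vector: NA count per column
  flagged <- na_counts[na_counts > 0]  # keep only the exceptions
  if (length(flagged) == 0) {
    return(glue("No missing values found."))
  }
  # glue() is vectorised, so this yields one string per flagged column.
  glue("Column {names(flagged)} has {flagged} missing value(s).")
}

df <- iris
df[2, 3] <- NA
df[4, 2] <- NA
report_missing(df)
```

With the two NA values assigned as in the iris example, only Sepal.Width and Petal.Length are reported.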