Using (R) Markdown, Jekyll, & GitHub for a Website


December 10, 2012
Categorized as: R, R-Bloggers, Jekyll, github


Introduction

Markdown has been growing in popularity for writing documents on the web. With the introduction of R Markdown (see also Jeromy Anglim’s post on getting started with R Markdown) and knitr, publishing R analyses on the web has become much simpler. I recently converted my website from WordPress to Jekyll. Jekyll is a “static site generator” and is the framework used by GitHub Pages. You can view the complete source for this website on GitHub at https://github.com/jbryer/jbryer.github.com.

I have outlined two approaches for integrating R Markdown within the Jekyll framework. The first implements a Jekyll Converter that converts R Markdown files (those with the rmd extension by default, though this is configurable) when Jekyll processes the site. The second uses a shell script and an R function to convert rmd files to plain Markdown files that Jekyll can then process. The second approach is necessary when using GitHub Pages because user plugins are not supported there.

Approach One: Using a Jekyll Converter

First, we need to install RinRuby to call R from Ruby. In the terminal, execute:

gem install rinruby

Create rmarkdown.rb and place it in the _plugins folder. The converter class follows and can be downloaded here.

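module Jekyll
  # Converts R Markdown files to HTML by calling knitr through RinRuby.
  class RMarkdownConverter < Converter
    safe true
    priority :low

    # Load RinRuby lazily, the first time a file is converted.
    def setup
      return if @setup
      require 'rinruby'
      @setup = true
    rescue LoadError
      # A failed require raises LoadError, which a bare rescue does not catch.
      STDERR.puts 'do `gem install rinruby`'
      raise FatalException.new("Missing dependency: rinruby")
    end

    # Any extension containing "rmd" (case-insensitive) is handled.
    def matches(ext)
      ext =~ /rmd/i
    end

    def output_ext(ext)
      '.html'
    end

    # Pass the content to R, knit it to an HTML fragment, and pull it back.
    def convert(content)
      setup
      R.eval "require(knitr)"
      R.assign "content", content
      R.eval "content <- (knitr::knit2html(text = content, fragment.only = TRUE))"
      R.pull "content"
    end
  end
end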

To use the rmd file extension (edit the ext =~ /rmd/i line to change the extension used), you need to set the Markdown file extension explicitly in the _config.yml configuration file; otherwise Jekyll will attempt to process rmd files as plain Markdown files. With the setting below, this also means that you cannot use the md file extension for Markdown files. See this discussion on StackOverflow.

markdown_ext: markdown

Once created, RMarkdownConverter will convert rmd files to html each time Jekyll runs.
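
Note that the pattern in the matches method is unanchored, so any extension that merely contains "rmd" will be handed to the converter. As a minimal sketch of the difference (the helper names rmd_ext? and loose_rmd_ext? are hypothetical and are not part of Jekyll or of the plugin above), an anchored test could look like this:

```ruby
# Hypothetical helper, not part of Jekyll: an anchored version of the
# extension test used by the converter's `matches` method.
def rmd_ext?(ext)
  # \A and \z anchor the whole string; the i flag ignores case.
  !!(ext =~ /\A\.rmd\z/i)
end

# The unanchored pattern from the plugin, for comparison.
def loose_rmd_ext?(ext)
  !!(ext =~ /rmd/i)
end

puts rmd_ext?('.rmd')        # true
puts rmd_ext?('.Rmd')        # true
puts rmd_ext?('.rmdx')       # false
puts loose_rmd_ext?('.rmdx') # true
```

Whether the stricter test is worth adopting depends on which other extensions appear in your site; the plugin as written works fine when rmd is the only one that contains those letters.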

Approach Two: Pre-process R Markdown Files

This approach is necessary for GitHub Pages since plugins are not supported. Here we convert the R Markdown file to plain Markdown using the R script below. The converted Markdown file is saved in the same directory so that Jekyll can then convert the resulting file. For simplicity, I place the rmarkdown.r script in the root directory of my site (alternatively, you can place the function in the .Rprofile file in your home directory). I then call rmd.sh (also located in the root directory), which first determines the directory the script is being executed from and then calls the convertRMarkdown function. This function processes all R Markdown files (.rmd by default) in the current working directory (which can be set explicitly with the dir parameter or by the rmd.sh script) and converts them to plain Markdown (with the .markdown file extension by default). Once converted, Jekyll will then process the resulting file(s). This file can be downloaded here.

#' This R script will process all R Markdown files (those with the in_ext file
#' extension, .rmd by default) in the given directory. Files with a status of
#' 'process' will be converted to Markdown (with the out_ext file extension,
#' '.markdown' by default). It will change the published parameter to 'true'
#' and change the status parameter to 'publish'.
#' 
#' @param dir the directory to process R Markdown files.
#' @param images.dir the directory passed to knitr's base.dir option.
#' @param images.url the URL prefix passed to knitr's base.url option.
#' @param out_ext the file extension to use for processed files.
#' @param in_ext the file extension of input files to process.
#' @param recursive should rmd files in subdirectories be processed.
#' @return nothing.
#' @author Jason Bryer <jason@bryer.org>
convertRMarkdown <- function(dir=getwd(), images.dir=dir, images.url='/images/',
           out_ext='.markdown', in_ext='.rmd', recursive=FALSE) {
  require(knitr, quietly=TRUE, warn.conflicts=FALSE)
  # Match in_ext literally at the end of the file name and return full paths
  # so that files are found even when dir is not the working directory.
  files <- list.files(path=dir, pattern=glob2rx(paste('*', in_ext, sep='')),
             ignore.case=TRUE, recursive=recursive, full.names=TRUE)
  for(f in files) {
    message(paste("Processing ", f, sep=''))
    content <- readLines(f)
    frontMatter <- which(substr(content, 1, 3) == '---')
    if(length(frontMatter) == 2) {
      # Only consider lines located inside the front matter.
      inFront <- function(i) { i[i > frontMatter[1] & i < frontMatter[2]] }
      statusLine <- inFront(which(substr(content, 1, 7) == 'status:'))
      publishedLine <- inFront(which(substr(content, 1, 10) == 'published:'))
      if(length(statusLine) == 1) {
        status <- unlist(strsplit(content[statusLine], ':'))[2]
        status <- sub('[[:space:]]+$', '', status)
        status <- sub('^[[:space:]]+', '', status)
        if(!is.na(status) && tolower(status) == 'process') {
          #This is a bit of a hack but if a line has zero length (i.e. a
          #blank line), it will be removed in the resulting markdown file.
          #This will ensure that all line returns are retained.
          content[nchar(content) == 0] <- ' '
          content[statusLine] <- 'status: publish'
          content[publishedLine] <- 'published: true'
          # Swap the input extension for the output extension.
          outFile <- paste(substr(f, 1, (nchar(f)-(nchar(in_ext)))), out_ext, sep='')
          render_markdown(strict=TRUE)
          opts_knit$set(out.format='markdown')
          opts_knit$set(base.dir=images.dir)
          opts_knit$set(base.url=images.url)
          try(knit(text=content, output=outFile), silent=FALSE)
        } else {
          warning(paste("Not processing ", f, ", status is '", status, 
                  "'. Set status to 'process' to convert.", sep=''))
        }
      } else {
        warning("Status not found in front matter.")
      }
    } else {
      warning("No front matter found. Will not process this file.")
    }
  }
  invisible()
}

Here is the source to the rmd.sh shell script for calling the convertRMarkdown function. This file can be downloaded here.

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
Rscript -e "source('$DIR/rmarkdown.r'); convertRMarkdown(images.dir='$DIR/images')"

YAML Front Matter

There are two parameters you can specify in the YAML Front Matter to alter how the convertRMarkdown function handles a particular file. First, the published parameter should be set to false so that Jekyll will not attempt to process the file. The convertRMarkdown function will change this parameter to true in the resulting Markdown file. The second parameter, status, must be set to process for the convertRMarkdown function to convert the file. Any other value leaves the file unconverted, which is useful when you are working on a draft.

published: false
status: process
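
To make the decision rule concrete, here is a minimal sketch in Ruby. The post's own implementation is the R function above; the helper name convert_rmd? and the sample text are assumptions made purely for illustration. It mirrors the checks described: exactly two `---` delimiter lines, and a status value of process between them.

```ruby
# Hypothetical helper mirroring the checks convertRMarkdown performs in R:
# a file is converted only when its front matter sets `status: process`.
def convert_rmd?(text)
  lines = text.lines.map(&:chomp)
  # Indices of the front matter delimiters (lines starting with '---').
  fences = lines.each_index.select { |i| lines[i].start_with?('---') }
  return false unless fences.length == 2

  # Look for a status line strictly between the two delimiters.
  front = lines[(fences[0] + 1)...fences[1]]
  status_line = front.find { |l| l.start_with?('status:') }
  return false if status_line.nil?

  status_line.split(':', 2)[1].strip.downcase == 'process'
end

draft = "---\npublished: false\nstatus: draft\n---\nText\n"
ready = "---\npublished: false\nstatus: process\n---\nText\n"

puts convert_rmd?(draft) # false
puts convert_rmd?(ready) # true
```

A file without front matter, or without a status line inside it, is treated as not ready, which matches the warnings the R function issues in those cases.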

Lastly, one difficulty with Jekyll is the inclusion of images in posts. The default behavior assumes that all images will be saved in the /images directory. This can be configured using the images.dir and images.url parameters of convertRMarkdown, which are passed to knitr as the base.dir and base.url options.

Example

The source for this post can be downloaded from GitHub. In this example we will analyze the reading attitude items for North America from the Programme for International Student Assessment (PISA) using the likert package. The first R chunk will load and recode the data.

require(likert)
data(pisana)

## Warning: data set 'pisana' not found

items <- pisana[,c(
	'ST24Q01', #Only if I have to
	'ST24Q02', #Favourite hobbies
	'ST24Q03', #Talk about books
	'ST24Q04', #Hard to finish
	'ST24Q05', #Happy as present
	'ST24Q06', #Waste of time
	'ST24Q07', #Enjoy library
	'ST24Q08', #Need information
	'ST24Q09', #Cannot sit still
	'ST24Q10', #Express opinions
	'ST24Q11'  #Exchange
	)]

## Error: object 'pisana' not found

names(items) <- c("I read only if I have to.",
		"Reading is one of my favorite hobbies.",
		"I like talking about books with other people.",
		"I find it hard to finish books.",
		"I feel happy if I receive a book as a present.",
		"For me, reading is a waste of time.",
		"I enjoy going to a bookstore or a library.",
		"I read only to get information that I need.",
		"I cannot sit still and read for more than a few minutes.",
		"I like to express my opinions about books I have read.",
		"I like to exchange books with my friends")

## Error: object 'items' not found

for(i in 1:ncol(items)) {
	items[,i] <-  factor(items[,i], levels=c(1,2,3,4), ordered=TRUE,
		labels=c('Strongly Disagree', 'Disagree', 'Agree', 'Strongly Agree'))
}

## Error: object 'items' not found

l <- likert(items, grouping=pisana$CNT)

## Error: object 'items' not found

Once the likert function has been called, we can print the summary.

options(width=120)
summary(l)

## Error: object 'l' not found

And of course, we can include plots.

plot(l, centered=TRUE)

## Error: object 'l' not found

Final Thoughts

The conversion from WordPress wasn’t necessarily trivial, but the benefits of using Jekyll have made it worthwhile. The ability to embed R code within the site’s content makes writing posts about R much easier than executing R code, copying and pasting the results into WordPress (or Gists), and publishing through a database-backed system for a site that changes relatively infrequently. I will soon be publishing results from a large study, and this exercise has shown that R Markdown is an ideal solution.

Lastly, I must give a big thanks to Tal Galili, who maintains R-Bloggers, for his help and patience as I worked out the issues getting the RSS feed to work with his platform.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.