
How to use Pandoc to write a research paper

This article takes a deep dive into how to produce a research paper using (mostly) Markdown syntax. We'll cover how to create and reference sections, figures (in Markdown and LaTeX), and bibliographies. We'll also discuss troublesome cases and why writing them in LaTeX is the right approach.

Research

Research papers usually contain references to sections, figures, tables, and a bibliography. Pandoc by itself cannot easily cross-reference these, but it can leverage the pandoc-crossref filter to do the automatic numbering and cross-referencing of sections, figures, and tables.

Let's start with an example of an educational research paper originally written in LaTeX and rewrite it in Markdown (and some LaTeX) with Pandoc and pandoc-crossref.

Adding and referencing sections

Sections are automatically numbered and must be written using the Markdown heading H1. Subsections are written with subheadings H2-H4 (it is uncommon to need more than that). For example, to write a section titled "Implementation", write # Implementation {#sec:implementation}, and Pandoc produces 3. Implementation (or the corresponding numbered section). The title "Implementation" uses heading H1 and declares a label, {#sec:implementation}, that authors can use to refer to that section. To reference a section, type the @ symbol followed by the label of the section and enclose it in square brackets: [@sec:implementation].

In the example paper, we find the following sentence:

we lack expertise (consistency between TAs, [@sec:implementation]).

Pandoc produces:

we lack expertise (consistency between TAs, Section 4).

Sections are numbered automatically (this is covered in the Makefile at the end of the article). To create unnumbered sections, type the title of the section, followed by {-}. For example, ### Designing a game for maintainability {-} creates an unnumbered subsection with the title "Designing a game for maintainability".
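To make the numbering rules concrete, here is a minimal Python sketch of the bookkeeping involved. It is a toy model and not how pandoc-crossref is implemented: the names number_sections and resolve_refs are made up for illustration, and it assumes headings carry labels in the {#sec:name} form and that unnumbered headings end in {-}.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")
ATTRS = re.compile(r"\{([^}]*)\}\s*$")


def number_sections(markdown):
    """Return {label: number} for numbered headings carrying a #sec: label."""
    counters = [0] * 6  # one counter per heading level H1..H6
    labels = {}
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if not match:
            continue
        level = len(match.group(1))
        attrs = ATTRS.search(match.group(2))
        tokens = attrs.group(1).split() if attrs else []
        if "-" in tokens or ".unnumbered" in tokens:
            continue  # unnumbered headings do not advance any counter
        counters[level - 1] += 1
        counters[level:] = [0] * (6 - level)  # reset deeper levels
        number = ".".join(str(c) for c in counters[:level])
        for token in tokens:
            if token.startswith("#sec:"):
                labels[token[1:]] = number
    return labels


def resolve_refs(text, labels, prefix="Section"):
    """Replace [@sec:label] with 'Section N'; unknown labels are left alone."""
    def substitute(match):
        label = match.group(1)
        return f"{prefix} {labels[label]}" if label in labels else match.group(0)
    return re.sub(r"\[@(sec:[\w-]+)\]", substitute, text)


# Hypothetical mini-document used only to exercise the two functions.
paper = """# Introduction
# Related work
## Designing a game for maintainability {-}
# Implementation {#sec:implementation}
"""
labels = number_sections(paper)
print(resolve_refs("(consistency between TAs, [@sec:implementation])", labels))
```

With this mini-document the script prints "(consistency between TAs, Section 3)": the unnumbered subsection is skipped, so "Implementation" is the third numbered heading.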

Adding and referencing figures

Adding and referencing a figure is similar to referencing a section and adding a Markdown image:

![Scatterplot matrix](knowledge/scatterplots/RScatterplotMatrix2.png){#fig:scatter-matrix}

The line above tells Pandoc that there is a figure with the caption Scatterplot matrix and that the path to the image is knowledge/scatterplots/RScatterplotMatrix2.png. The attribute {#fig:scatter-matrix} declares the name that should be used to reference the figure.

Here is an example of a figure reference from the example paper:

The bins "Enjoy", "Grade" and "Motivation" ([@fig:scatter-matrix]) ...

Pandoc produces the following output:

The bins "Enjoy", "Grade" and "Motivation" (Fig. 1) ...

Adding and referencing a bibliography

Most research papers keep references in a BibTeX database file. In this example, this file is named biblio.bib and it contains all the references of the paper. Here is what this file looks like:

@inproceedings{wrigstad2017mastery,
  Title =        {Mastery Learning-Like Teaching with Achievements},
  ...
}

@inproceedings{review-gamification-framework,
  Author =       {...},
  Publisher =    {IEEE},
  Booktitle =    {...},
  Doi =          {...},
  Keywords =     {formal specification;serious games (computing);design
                  framework;formal design process;game components;game design
                  elements;gamification design frameworks;gamification-based
                  solutions;Bibliographies;Context;Design
                  methodology;Ethics;Games;Proposals},
  Month =        {Sept},
  Pages =        {...},
  Title =        {...},
  Year =         2015,
  Bdsk-Url-1 =   {...}
}

...

The first line, @inproceedings{wrigstad2017mastery, declares the type of publication (inproceedings) and the label used to refer to that paper (wrigstad2017mastery).
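Because every citation hinges on these labels, it can help to list the ones a BibTeX file actually defines. The following is a small Python sketch that does this with a regular expression; citation_keys is a hypothetical helper, the sample entries are abbreviated placeholders, and the pattern only handles the common @type{key, layout rather than the full BibTeX grammar.

```python
import re

# Matches "@type{key," at the start of an entry.
ENTRY = re.compile(r"@(\w+)\s*\{\s*([^,\s{}]+)\s*,")
# BibTeX directives that are not bibliography entries.
NON_ENTRIES = {"string", "comment", "preamble"}


def citation_keys(bibtex):
    """Return {citation key: entry type} for the entries in a BibTeX string."""
    keys = {}
    for kind, key in ENTRY.findall(bibtex):
        if kind.lower() not in NON_ENTRIES:
            keys[key] = kind.lower()
    return keys


# Abbreviated sample entries; a real biblio.bib holds the full fields.
sample = """
@inproceedings{wrigstad2017mastery,
  Title = {Mastery Learning-Like Teaching with Achievements},
}

@inproceedings{review-gamification-framework,
  Year = 2015,
}
"""
print(citation_keys(sample))
```

To inspect a real file, pass the contents of biblio.bib to citation_keys instead of the sample string.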

To cite the paper with the title Mastery Learning-Like Teaching with Achievements, type:

the achievement-driven learning methodology [@wrigstad2017mastery]

Pandoc will output:

the achievement-driven learning methodology [30]

The paper we will produce includes a bibliography section with numbered references.

Citing a collection of articles is easy: simply cite each article, separating the labeled references with a semicolon (;). If there are two labeled references, for example SEABORN201514 and gamification-leaderboard-benefits, cite them together like this:

Thus, the most important benefit is its potential to increase students' motivation
and engagement [@SEABORN201514;@gamification-leaderboard-benefits].

Pandoc will produce:

Thus, the most important benefit is its potential to increase students' motivation
and engagement [26, 28]
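A citation whose label is missing from the bibliography is easy to overlook in a long paper. As a rough illustration, the Python sketch below collects the labels cited in a Markdown string and reports those not found in a given set of known labels. The function names are hypothetical, the regular expressions cover only the bracketed [@a;@b] form shown above, and labels starting with pandoc-crossref prefixes such as fig: or sec: are treated as cross-references rather than citations.

```python
import re

# Prefixes that pandoc-crossref reserves for cross-references, not citations.
CROSSREF_PREFIXES = ("fig:", "sec:", "tbl:", "eq:", "lst:")


def cited_keys(markdown):
    """Collect citation keys, in order of first use, from [@a;@b] citations."""
    keys = []
    for group in re.findall(r"\[([^\]]*@[^\]]*)\]", markdown):
        for key in re.findall(r"@(\w[\w:-]*)", group):
            if not key.startswith(CROSSREF_PREFIXES) and key not in keys:
                keys.append(key)
    return keys


def missing_keys(markdown, known_keys):
    """Return the cited keys that are absent from known_keys."""
    return [key for key in cited_keys(markdown) if key not in known_keys]


text = (
    "engagement [@SEABORN201514;@gamification-leaderboard-benefits] "
    'and the boxes ([@fig:scatter-matrix])'
)
print(cited_keys(text))
print(missing_keys(text, {"SEABORN201514"}))
```

The second call assumes, purely for demonstration, that only SEABORN201514 is defined in the bibliography, so it reports gamification-leaderboard-benefits as missing.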

Problematic cases

A common problem involves objects that do not fit on the page. They then float to wherever they fit best, even if that position is not where the reader expects to see them. Since papers are easier to read when figures or tables appear close to where they are mentioned, we need some control over where these elements are placed. For this reason, I recommend the use of the figure LaTeX environment, which lets users control the positioning of figures.

Let's take the figure example shown above:

![Scatterplot matrix](knowledge/scatterplots/RScatterplotMatrix2.png){#fig:scatter-matrix}

And rewrite it in LaTeX:

\begin{figure}[t]
\includegraphics{knowledge/scatterplots/RScatterplotMatrix2.png}
\caption{\label{fig:scatter-matrix}Scatterplot matrix}
\end{figure}

In LaTeX, the [t] option in the figure environment declares that the image should be placed at the top of the page. For more options, refer to the Wikibooks article LaTeX/Floats, Figures, and Captions.

Producing the paper

So far, we have covered how to add and reference (sub-)sections and figures and how to cite the bibliography. Now let's review how to produce the research paper in PDF format. To generate the PDF, we will use Pandoc to generate a LaTeX file that can be compiled to the final PDF. We will also discuss how to generate the research paper in LaTeX using a customized template and a meta-information file, and how to compile the LaTeX document into its final PDF form.

Most conferences provide a .cls file or a template that specifies how papers should look; for example, whether they should use a two-column format and other design treatments. In our example, the conference provided a file named acmart.cls.

Authors are generally expected to include the institution to which they belong in their papers. However, this option was not included in Pandoc's default LaTeX template (note that the Pandoc template can be inspected by typing pandoc -D latex). To include the affiliation, take Pandoc's default LaTeX template and add a new field. The Pandoc template was copied into a file named mytemplate.tex as follows:

pandoc -D latex > mytemplate.tex

The default template contains the following code:

$if(author)$
\author{$for(author)$$author$$sep$ \and $endfor$}
$endif$
$if(institute)$
\providecommand{\institute}[1]{}
\institute{$for(institute)$$institute$$sep$ \and $endfor$}
$endif$

Because the template should include the author's affiliation and email address, among other things, we updated it to include these fields (we made other changes as well but did not include them here due to the file length):

$for(author)$
    $if(author.name)$
        \author{$author.name$}
        $if(author.affiliation)$
            \affiliation{\institution{$author.affiliation$}}
        $endif$
        $if(author.email)$
            \email{$author.email$}
        $endif$
    $else$
        $author$
    $endif$
$endfor$
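To see what such a per-author loop produces, here is a minimal Python sketch that mimics it. This is not Pandoc's template engine: render_authors is a hypothetical function, the email address is a placeholder, and the sketch assumes the template emits the acmart commands \author, \affiliation, \institution, and \email for structured authors and falls back to the plain value otherwise.

```python
def render_authors(authors):
    """Mimic the template loop: structured authors get acmart commands,
    plain strings are emitted unchanged (the $else$ branch)."""
    lines = []
    for author in authors:
        if isinstance(author, dict) and author.get("name"):
            lines.append("\\author{" + author["name"] + "}")
            if author.get("affiliation"):
                lines.append(
                    "\\affiliation{\\institution{" + author["affiliation"] + "}}"
                )
            if author.get("email"):
                lines.append("\\email{" + author["email"] + "}")
        else:
            lines.append(str(author))
    return "\n".join(lines)


# Placeholder metadata; the email address is made up for this example.
authors = [
    {
        "name": "Kiko Fernandez-Reyes",
        "affiliation": "Uppsala University",
        "email": "author@example.org",
    },
    "A plain author string",
]
print(render_authors(authors))
```

The optional fields are simply skipped when absent, which is why the template guards each one with its own conditional.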

With these changes in place, we should have the following files:

  • major.md contains the research paper
  • biblio.bib contains the bibliographic database
  • acmart.cls is the document class that we should use
  • mytemplate.tex is the template file to use (instead of the default)

Let's add the meta-information of the paper in a meta.yaml file:

---
template: 'mytemplate.tex'
documentclass: acmart
classoption: sigconf
title: The impact of opt-in gamification on `\\`{=latex} students' grades in a software design course
author:
- name: Kiko Fernandez-Reyes
  affiliation: Uppsala University
  email: [email protected]
- name: Dave Clarke
  affiliation: Uppsala University
  email: [email protected]
- name: Janina Hornbach
  affiliation: Uppsala University
  email: [email protected]
bibliography: biblio.bib
abstract: |
  An achievement-driven methodology strives to give students more control over their learning with enough flexibility to engage them in deeper learning. (more stuff continues)

include-before: |
  ```{=latex}
  \copyrightyear{2018}
  \acmYear{2018}
  \setcopyright{...}
  \acmConference[MODELS '18 Companion]{ACM/IEEE 21th International Conference on Model Driven Engineering Languages and Systems}{October 14--19, 2018}{Copenhagen, Denmark}
  \acmBooktitle{ACM/IEEE 21th International Conference on Model Driven Engineering Languages and Systems (MODELS '18 Companion), October 14--19, 2018, Copenhagen, Denmark}
  \acmPrice{XX.XX}
  \acmDOI{10.1145/3270112.3270118}
  \acmISBN{978-1-4503-5965-8/18/10}

  \begin{CCSXML}
  <ccs2012>
  <concept>
  <concept_id>10010405.10010489</concept_id>
  <concept_desc>Applied computing~Education</concept_desc>
  <concept_significance>500</concept_significance>
  </concept>
  </ccs2012>
  \end{CCSXML}

  \ccsdesc[500]{Applied computing~Education}

  \keywords{gamification, education, software design, UML}
  ```
figPrefix:
  - "Fig."
  - "Figs."
secPrefix:
  - "Section"
  - "Sections"
...

This meta-information file sets the following variables in LaTeX:

  • template refers to the template to use ('mytemplate.tex')
  • documentclass refers to the LaTeX document class to use (acmart)
  • classoption refers to the options of the class, in this case sigconf
  • title specifies the title of the paper
  • author is an object that contains other fields, such as name, affiliation, and email
  • bibliography refers to the file that contains the bibliography (biblio.bib)
  • abstract contains the abstract of the paper
  • include-before is information that should be included before the actual content of the paper, here the ACM copyright, conference, and keyword commands. I've included it here to show how to generate a computer science paper, but you may choose to skip it
  • figPrefix specifies how to refer to figures in the document, i.e., what should be displayed when one refers to the figure [@fig:scatter-matrix]. For example, with the current figPrefix, the sentence The bins "Enjoy", "Grade" and "Motivation" ([@fig:scatter-matrix]) is rendered as The bins "Enjoy", "Grade" and "Motivation" (Fig. 3). If several figures are referenced together, the current setup displays Figs. next to the figure numbers instead
  • secPrefix specifies how to refer to sections mentioned elsewhere in the document (similar to figures, described above)
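The singular and plural forms in figPrefix and secPrefix can be pictured with a few lines of Python. This is only an illustration of the idea: format_ref is a hypothetical function, and the real filter has further formatting options for multiple references that this sketch does not attempt to reproduce.

```python
def format_ref(prefixes, numbers):
    """Pick the singular or plural prefix depending on how many items
    are referenced. prefixes is [singular, plural], as in figPrefix."""
    if not numbers:
        raise ValueError("at least one number is required")
    prefix = prefixes[0] if len(numbers) == 1 else prefixes[1]
    return prefix + " " + ", ".join(str(n) for n in numbers)


fig_prefix = ["Fig.", "Figs."]
sec_prefix = ["Section", "Sections"]
print(format_ref(fig_prefix, [3]))     # one figure: singular prefix
print(format_ref(fig_prefix, [1, 2]))  # several figures: plural prefix
print(format_ref(sec_prefix, [4]))
```

The point of listing both forms in meta.yaml is that the author never types "Fig." or "Section" by hand, so the wording stays consistent across the paper.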

Now that the meta-information is set, let's create a Makefile that produces the desired output. This Makefile uses Pandoc to produce the LaTeX file, pandoc-crossref to produce the cross-references, pdflatex to compile the LaTeX to PDF, and bibtex to process the references.

The Makefile is shown below:

all: paper

# Convert the Markdown to LaTeX, then run pdflatex and bibtex until the
# citations and cross-references are resolved.
paper:
	@pandoc -s -F pandoc-crossref --natbib meta.yaml --template=mytemplate.tex -N \
	 -f markdown -t latex+raw_tex+tex_math_dollars+citations -o major.tex major.md
	@pdflatex major.tex > /dev/null 2>&1
	@bibtex major > /dev/null 2>&1
	@pdflatex major.tex > /dev/null 2>&1
	@pdflatex major.tex > /dev/null 2>&1

# Remove the generated files.
clean:
	rm -f major.aux major.tex major.log major.bbl major.blg major.out

.PHONY: all clean paper

Pandoc uses the following flags:

  • -s to create a standalone LaTeX document
  • -F pandoc-crossref to make use of the filter pandoc-crossref
  • --natbib to render the bibliography with natbib (you can also choose --biblatex)
  • --template sets the template file to use
  • -N to number the section headings
  • -f and -t specify the formats to convert from and to. -t usually contains the format and is followed by the Pandoc extensions used. In the example, we declared raw_tex+tex_math_dollars+citations: raw_tex allows raw LaTeX in the middle of the Markdown file, tex_math_dollars lets us type math formulas as in LaTeX, and citations enables Pandoc's citation syntax.

To generate a PDF from LaTeX, follow the guidelines from bibtex to process the bibliography:

@pdflatex major.tex > /dev/null 2>&1
@bibtex major > /dev/null 2>&1
@pdflatex major.tex > /dev/null 2>&1
@pdflatex major.tex > /dev/null 2>&1

The leading @ tells make not to echo each command before running it, and we redirect standard output and standard error to /dev/null so that we do not see the output generated by these commands. The redirection is written as > /dev/null 2>&1 because make runs recipes with /bin/sh by default, where the shorter &> form is not reliably supported.
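For readers who prefer not to use make, the same sequence can be driven from a short Python script. The sketch below is one possible arrangement rather than part of the original setup: build_commands and run_build are hypothetical names, the file names default to the ones used in this article, and the Pandoc flags are copied from the Makefile above.

```python
import subprocess


def build_commands(stem="major", meta="meta.yaml", template="mytemplate.tex"):
    """Return the five build commands as argument lists."""
    pandoc = [
        "pandoc", "-s", "-F", "pandoc-crossref", "--natbib", meta,
        f"--template={template}", "-N",
        "-f", "markdown", "-t", "latex+raw_tex+tex_math_dollars+citations",
        "-o", f"{stem}.tex", f"{stem}.md",
    ]
    pdflatex = ["pdflatex", f"{stem}.tex"]
    bibtex = ["bibtex", stem]
    # pdflatex runs once before bibtex and twice after it so that
    # citations and cross-references settle.
    return [pandoc, pdflatex, bibtex, pdflatex, pdflatex]


def run_build(check=True, **kwargs):
    """Run each step with its output discarded. With check=True, the first
    failing step raises subprocess.CalledProcessError."""
    for command in build_commands(**kwargs):
        subprocess.run(
            command,
            check=check,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )


# Print the commands without executing them.
for command in build_commands():
    print(" ".join(command))
```

Calling run_build() executes the steps and requires pandoc, pandoc-crossref, pdflatex, and bibtex to be installed. Some of these tools may exit with a non-zero status for warnings alone, in which case run_build(check=False) behaves more like the Makefile with its output hidden.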

The final result is shown below. The repository for the article can be found on GitHub.

Conclusion

In my opinion, research is all about collaboration, dissemination of ideas, and improving the state of the art in whatever field one happens to be in. Most computer scientists and engineers write papers using the LaTeX document system, which provides excellent support for math. Researchers from the social sciences seem to stick to DOCX documents.

When researchers from different communities write papers together, they should first discuss which format they will use. While DOCX may not be convenient for engineers if there is math involved, LaTeX may be troublesome for researchers who lack a programming background. As this article shows, Markdown is an easy-to-use language that can be used by both engineers and social scientists.
