view fastqc_report.Rmd @ 14:2efa46ce2c4c draft

upgrade fastqc_report
author mingchen0919
date Wed, 18 Oct 2017 22:06:39 -0400
parents e629c2288316
children d1d20f341632

---
title: 'HTML report title'
output:
    html_document:
      number_sections: true
      toc: true
      theme: cosmo
      highlight: tango
---

```{r setup, include=FALSE, warning=FALSE, message=FALSE}
knitr::opts_chunk$set(
  echo = ECHO
)
```


# FastQC Analysis

* Copy fastq files to job working directory

```{bash 'copy files'}
for f in $(echo READS | sed "s/,/ /g")
do
    cp "$f" ./
done
```

* Run fastqc

```{bash 'run fastqc'}
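# FastQC expects the output directory to exist already.
mkdir -p REPORT_DIR
# Iterate over the glob directly instead of parsing `ls` output.
for r in *.dat
do
    fastqc -o REPORT_DIR "$r" > /dev/null 2>&1
done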
```

* Create links to original HTML reports

```{r 'html report links'}
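library(htmltools)  # provides the `tags` helpers used below
# Collect the FastQC HTML reports and render them as a list of links.
html_files = list.files('REPORT_DIR', pattern = '\\.html$')
html_report_list = lapply(html_files, function(i) tags$li(tags$a(href = i, i)))
tags$ul(html_report_list)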
```

# FastQC output summary

* Define a function that extracts the data table of a single module from a FastQC `fastqc_data.txt` file

```{r 'function definition'}
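# Return the data of one FastQC module as a data frame.
#   fastqc_data: path to a fastqc_data.txt file
#   module_name: pattern matching the module header line
extract_data_module = function(fastqc_data, module_name) {
  f = readLines(fastqc_data)
  # Use the first matching line in case the pattern matches more than once.
  start_line = grep(module_name, f)[1]
  if (is.na(start_line)) stop('Module not found: ', module_name)
  # Each module is closed by the first END_MODULE marker after its header.
  end_module_lines = grep('END_MODULE', f)
  end_line = end_module_lines[end_module_lines > start_line][1]
  module_data = f[(start_line + 1):(end_line - 1)]
  # Parse the tab-separated lines in memory rather than via a temporary file.
  read.csv(text = paste(module_data, collapse = '\n'), sep = '\t')
}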
```
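
* Example (display only, not evaluated): a minimal sketch of the slicing step that `extract_data_module` relies on, applied to an in-memory vector. The vector `sample_lines` and every value in it are hypothetical stand-ins for a heavily truncated `fastqc_data.txt`; they are assumptions made for illustration, not real FastQC output.

```r
# Hypothetical, truncated stand-in for the lines of a fastqc_data.txt file.
sample_lines <- c(
  "##FastQC\t0.11.5",
  ">>Basic Statistics\tpass",
  "#Measure\tValue",
  "Total Sequences\t1000",
  ">>END_MODULE",
  ">>Per base sequence quality\tpass",
  "#Base\tMean\tMedian",
  "1\t32.1\t33.0",
  "2\t32.4\t33.0",
  ">>END_MODULE"
)

# Find the module header and the first END_MODULE marker that follows it.
start_line <- grep("Per base sequence quality", sample_lines, fixed = TRUE)[1]
end_markers <- grep("END_MODULE", sample_lines, fixed = TRUE)
end_line <- end_markers[end_markers > start_line][1]

# Everything between the two markers is a tab-separated table with a header.
module_lines <- sample_lines[(start_line + 1):(end_line - 1)]
quality <- read.delim(text = paste(module_lines, collapse = "\n"))
print(quality)
```

  Because the header line starts with `#`, R adjusts that column name on import, so it is safer to refer to columns such as `Mean` by name, or to the first column by position.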

## 

# Session Info

```{r 'session info'}
sessionInfo()
```