annotate README @ 2:4c87b4cc1176

Add simple label to output files for IGV display application (it does not handle punctuation in URLs)
author Jim Johnson <jj@umn.edu>
date Mon, 15 Jun 2015 15:22:59 -0500
parents cec60c540546
Inputs:

- A tabular file that contains a column with a peptide sequence and a column with an identifier for a reference sequence
- FASTA files for the reference sequences
- GFF or GTF file for mapping the reference sequences to a genome
- Reference genome FASTA

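A minimal sketch of reading the tabular input, assuming a tab-separated file. The column positions, the function name read_peptide_table and the identifiers in the sample data are illustrative assumptions, not taken from the tool.

```python
import csv
import io

def read_peptide_table(handle, peptide_col=0, ref_id_col=1):
    """Yield (peptide, reference_id) pairs from a tab-separated file.

    Column indices are 0-based; the defaults are assumptions.
    """
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        # Skip rows too short to hold both columns.
        if len(row) <= max(peptide_col, ref_id_col):
            continue
        yield row[peptide_col].strip(), row[ref_id_col].strip()

# Made-up sample data for illustration only.
example = io.StringIO(
    "#peptide\tref_id\n"
    "MKTAYIAK\tENST00000000001\n"
    "\n"
    "LLGDFFRK\tENST00000000002\n"
)
pairs = list(read_peptide_table(example))
```
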
Ensembl transcript_id files: Homo_sapiens.GRCh37.71.gtf, GRCh37.fa
  transcript GTF + reference
  map peptide to 3-frame translation of transcript
  map to reference genome with Ensembl GTF

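Mapping through an Ensembl GTF first needs the exon intervals of each transcript. The following is one possible sketch, assuming the standard 9-column GTF layout with a transcript_id attribute. The function name exons_by_transcript and the sample lines are hypothetical.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')

def exons_by_transcript(gtf_lines):
    """Return {transcript_id: [(chrom, start, end, strand), ...]}.

    Coordinates are kept as in GTF (1-based, inclusive). Exons are
    sorted in transcript order: ascending start on '+', descending on '-'.
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Only exon features carry the transcript structure we need.
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if not match:
            continue
        exons[match.group(1)].append(
            (fields[0], int(fields[3]), int(fields[4]), fields[6])
        )
    for items in exons.values():
        items.sort(key=lambda e: e[1], reverse=(items[0][3] == "-"))
    return dict(exons)

# Made-up GTF lines for illustration only.
gtf = [
    '1\tprotein_coding\texon\t300\t399\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    '1\tprotein_coding\tCDS\t100\t199\t.\t+\t0\tgene_id "G1"; transcript_id "T1";\n',
    '1\tprotein_coding\texon\t100\t199\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
]
exon_map = exons_by_transcript(gtf)
```
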
ECGene ec_id files: ECgene_hg18_b1_low.fa, GRCh37.fa
  transcript from ecgene.fa
  map peptide to 3-frame translation of transcript
  map transcript to reference genome with BLAT

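BLAT reports alignments in PSL format by default, where each aligned block corresponds to a contiguous stretch of the transcript on the genome. A hedged sketch of pulling the blocks out of one PSL line follows. The function name parse_psl_line and the sample alignment are made up, and the block starts are returned as PSL gives them (0-based; query starts on '-' strand hits are in reverse-complement coordinates).

```python
def parse_psl_line(line):
    """Parse one 21-column PSL alignment line into a small dict."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 21:
        raise ValueError("expected 21 tab-separated PSL columns")

    def int_list(text):
        # PSL lists are comma-separated with a trailing comma.
        return [int(x) for x in text.split(",") if x]

    sizes = int_list(f[18])
    q_starts = int_list(f[19])
    t_starts = int_list(f[20])
    return {
        "strand": f[8],
        "q_name": f[9],
        "t_name": f[13],
        # (query_start, target_start, size) per aligned block
        "blocks": list(zip(q_starts, t_starts, sizes)),
    }

# Made-up alignment: a 200 nt transcript split into two 100 nt blocks.
psl = ("200\t0\t0\t0\t0\t0\t1\t100\t+\ttx1\t200\t0\t200\t"
       "chr1\t1000\t99\t399\t2\t100,100,\t0,100,\t99,299,")
hit = parse_psl_line(psl)
```
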
Augustus id files: ssc10.2.RNA.hints.augustus.fa, ssc10.2.RNA.hints.augustus.gff
  map peptide to Augustus protein fasta
  map to reference genome with GFF3

EEJ files: Homo_sapiens.GRCh37.71.gtf, eej_sus_scrofa_core_70_102.fa
  map peptide to EEJ fasta
  parse id to find exon names and junc_pos
  map to reference genome with exon_id in Ensembl GTF


Output:
  a GFF3 file that specifies the position of the peptide in a reference genome


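As an illustration of the output, one GFF3 feature line for a mapped peptide might be built as below. The source label "peptide_map", the feature type "match", the attribute choice and the function name gff3_line are assumptions, not what the tool necessarily writes. GFF3 coordinates are 1-based and inclusive.

```python
from urllib.parse import quote

def gff3_line(seqid, start, end, strand, peptide,
              source="peptide_map", feature_type="match"):
    """Format one 9-column GFF3 line (1-based, inclusive coordinates)."""
    # Percent-encode the attribute value so characters reserved in
    # column 9 (such as ';' and '=') cannot break the attribute list.
    label = quote(peptide, safe="")
    attributes = "ID={0};Name={0}".format(label)
    return "\t".join([
        seqid, source, feature_type,
        str(start), str(end),
        ".",      # score: not set
        strand,
        ".",      # phase: not set for this feature type
        attributes,
    ])

line = gff3_line("chr1", 100, 123, "+", "MKTAYIAK")
```
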
Mapping:
  find transcript in cDNA fasta:
  find transcript in translated fasta:


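Finding a transcript by its identifier in a FASTA file can be sketched with a small in-memory index. This assumes the identifier is the first whitespace-delimited token of each header line. The function name read_fasta and the sample records are hypothetical.

```python
import io

def read_fasta(handle):
    """Return {record_id: sequence} for a FASTA file handle."""
    seqs = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if name is not None:
                seqs[name] = "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs

# Made-up records for illustration only.
fasta = io.StringIO(">tx1 some description\nATGAAA\nACC\n>tx2\nGGG\n")
index = read_fasta(fasta)
transcript = index.get("tx1")
```
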
peptide to transcript:
  translate transcript to amino acid sequence and search for peptide
    tblastn
    Biopython

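The 3-frame translation and search step could look roughly like the sketch below. It uses a hand-built standard codon table so that it runs without dependencies (Biopython's translation could be used instead). Only the three forward frames are searched, and the names translate and find_peptide are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate DNA from its first base; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def find_peptide(transcript, peptide):
    """Return (frame, start, end) hits; start/end are 0-based,
    half-open nucleotide offsets within the transcript."""
    hits = []
    for frame in range(3):
        protein = translate(transcript[frame:])
        pos = protein.find(peptide)
        while pos != -1:
            start = frame + 3 * pos
            hits.append((frame, start, start + 3 * len(peptide)))
            pos = protein.find(peptide, pos + 1)
    return hits

# Made-up transcript: the peptide MKT is encoded in frame 2.
hits = find_peptide("GGATGAAAACCTAA", "MKT")
```
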
transcript to genome:
  If the fasta id lines contain the genomic mapping, use that
  Map transcript to reference genome with BLAT
  see if peptide crosses exon boundaries
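One way to check for exon boundary crossing is to project the peptide's transcript interval onto the exons and count the resulting genomic blocks. This is a sketch under stated assumptions: exons are given in transcript order as 1-based inclusive (start, end) pairs, the peptide interval is 0-based and half-open, and the function name transcript_span_to_genome is hypothetical.

```python
def transcript_span_to_genome(exons, strand, t_start, t_end):
    """Map transcript interval [t_start, t_end) to genomic blocks.

    Returns a list of 1-based inclusive (start, end) pairs, one per
    exon the interval overlaps, in transcript order.
    """
    blocks = []
    offset = 0  # transcript offset of the current exon's first base
    for g_start, g_end in exons:
        length = g_end - g_start + 1
        lo = max(t_start, offset)
        hi = min(t_end, offset + length)
        if lo < hi:
            if strand == "+":
                blocks.append((g_start + (lo - offset),
                               g_start + (hi - offset) - 1))
            else:
                # On '-', transcript offsets count down from the exon end.
                blocks.append((g_end - (hi - offset) + 1,
                               g_end - (lo - offset)))
        offset += length
    return blocks

# Made-up example: a 9 nt peptide span straddling two 100 nt exons.
plus_blocks = transcript_span_to_genome([(100, 199), (300, 399)], "+", 94, 103)
minus_blocks = transcript_span_to_genome([(300, 399), (100, 199)], "-", 94, 103)
crosses_boundary = len(plus_blocks) > 1
```

More than one block means the peptide spans a splice junction, and each block would then become its own feature line in the output.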