Next changeset 1:046085ec595c (2020-09-16) |
Commit message:
"planemo upload for repository https://github.com/QFAB-Bioinformatics/metaDEGalaxy/tree/master/uc2otutable commit 0db3cb4e9a87400bb2f8402ffc23334e24ad4b4e-dirty" |
added:
test-data/uc_input.txt test-data/uc_output.txt uclust2otutable.py uclust2otutable.xml |
diff -r 000000000000 -r e85e7ba38aff test-data/uc_input.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/uc_input.txt Mon Sep 14 04:52:36 2020 +0000 |
b'@@ -0,0 +1,5000 @@\n+H\t185016\t1365\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10011:3881_F3D0\t794170\n+H\t504265\t1397\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10019:15796_F3D147\t297768\n+H\t596361\t252\t97.6\t+\t0\t0\t531I252M589I\tM00967:43:000000000-A3JHG:1:1101:10029:4641_F3D6\t182733\n+H\t467122\t1408\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10025:3791_F3D7\t345662\n+H\t504265\t252\t99.6\t+\t0\t0\t522I252M623I\tM00967:43:000000000-A3JHG:1:1101:10032:19711_F3D1\t297768\n+H\t480352\t252\t99.2\t+\t0\t0\t493I252M621I\tM00967:43:000000000-A3JHG:1:1101:10037:16380_F3D147\t328698\n+H\t504265\t1397\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10044:19764_F3D1\t297768\n+H\t186862\t1359\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10049:25981_F3D147\t789537\n+H\t169328\t1394\t99.6\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10050:10579_F3D144\t835517\n+H\t195264\t252\t98.0\t+\t0\t0\t511I252M603I\tM00967:43:000000000-A3JHG:1:1101:10050:15564_F3D0\t768418\n+H\t525636\t1371\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10051:26098_F3D0\t271934\n+H\t532742\t252\t99.6\t+\t0\t0\t526I252M587I\tM00967:43:000000000-A3JHG:1:1101:10081:16140_F3D148\t263472\n+H\t216378\t253\t98.8\t+\t0\t0\t484I253M604I\tM00967:43:000000000-A3JHG:1:1101:10082:18564_F3D148\t715297\n+H\t480352\t1366\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10089:26006_F3D144\t328698\n+H\t475035\t253\t99.2\t+\t0\t0\t507I253M599I\tM00967:43:000000000-A3JHG:1:1101:10090:19695_F3D144\t335523\n+H\t181833\t1352\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10095:25435_F3D1\t802273\n+H\t582323\t1334\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10116:22122_F3D149\t201490\n+H\t557013\t1380\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10096:5628_F3D149\t231806\n+H\t195264\t1366\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10120:13996_F3D2\t768418\n+H\t467122\t1408\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10120:
18943_F3D141\t345662\n+H\t36315\t1364\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10133:12842_F3D6\t1051764\n+H\t911030\t1355\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10133:8460_F3D0\t641881\n+H\t1014062\t253\t98.4\t+\t0\t0\t510I253M607I\tM00967:43:000000000-A3JHG:1:1101:10134:24617_F3D1\t764495\n+H\t504265\t1397\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10150:11076_F3D147\t297768\n+H\t257482\t253\t99.6\t+\t0\t0\t484I253M605I\tM00967:43:000000000-A3JHG:1:1101:10156:19311_F3D1\t611419\n+H\t557013\t252\t99.6\t+\t0\t0\t414I252M714I\tM00967:43:000000000-A3JHG:1:1101:10164:7088_F3D1\t231806\n+H\t557013\t1380\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10174:10502_F3D147\t231806\n+H\t557013\t1380\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10187:15906_F3D148\t231806\n+H\t185044\t252\t99.6\t+\t0\t0\t511I252M607I\tM00967:43:000000000-A3JHG:1:1101:10187:5493_F3D146\t794112\n+H\t467122\t252\t97.2\t+\t0\t0\t533I252M623I\tM00967:43:000000000-A3JHG:1:1101:10188:16313_F3D147\t345662\n+H\t557013\t1380\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10190:13338_F3D141\t231806\n+H\t258309\t1341\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10190:21000_F3D9\t608864\n+H\t185239\t253\t99.6\t+\t0\t0\t496I253M605I\tM00967:43:000000000-A3JHG:1:1101:10203:26904_F3D149\t793598\n+H\t558934\t253\t98.8\t+\t0\t0\t513I253M707I\tM00967:43:000000000-A3JHG:1:1101:10216:10470_F3D142\t229622\n+H\t557013\t1380\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10207:9140_F3D144\t231806\n+H\t195126\t1353\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10237:26445_F3D145\t768762\n+H\t201774\t253\t99.6\t+\t0\t0\t497I253M601I\tM00967:43:000000000-A3JHG:1:1101:10224:3196_F3D149\t752203\n+H\t185044\t1370\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10238:25436_F3D6\t794112\n+H\t735713\t1353\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10238:8952_F3D6\t803937\n+H\t532742\t1365\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:
1101:10239:11185_F3D2\t263472\n+H\t587328\t252\t99.6\t+\t0\t0\t527I252M620I\tM00967:43:000000000-A3JHG:1:1101:10242:16345_F3D144\t194043\n+H\t558934\t253\t99.6\t+\t0\t0\t513I253M707I\tM00967:43:000000000-A3JHG:1:1101:10244:17119_F3D8\t229622\n+H\t475035\t1359\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10242:4421_F3D148\t335523\n+H\t532742\t1365\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1101:10245:17643_F3D148\t263472\n+H\t8'..b'H\t475035\t1359\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12699:16017_F3D6\t335523\n+H\t560581\t253\t99.2\t+\t0\t0\t510I253M708I\tM00967:43:000000000-A3JHG:1:1102:12702:13662_F3D142\t227754\n+H\t185016\t1365\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12707:22192_F3D143\t794170\n+H\t185044\t1370\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12708:25601_F3D6\t794112\n+H\t184332\t1351\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12711:13617_F3D2\t795920\n+H\t582323\t1334\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12715:16467_F3D148\t201490\n+H\t558934\t253\t99.6\t+\t0\t0\t513I253M707I\tM00967:43:000000000-A3JHG:1:1102:12716:14688_F3D5\t229622\n+H\t589853\t1386\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12723:20496_F3D144\t190916\n+H\t558934\t253\t99.6\t+\t0\t0\t513I253M707I\tM00967:43:000000000-A3JHG:1:1102:12727:21955_F3D2\t229622\n+H\t475035\t1359\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12742:1987_F3D9\t335523\n+H\t504265\t1397\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12744:8397_F3D145\t297768\n+H\t243563\t252\t99.6\t+\t0\t0\t512I252M603I\tM00967:43:000000000-A3JHG:1:1102:12755:24081_F3D2\t646449\n+H\t557013\t1380\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12757:18823_F3D144\t231806\n+H\t1145983\t253\t99.6\t+\t0\t0\t516I253M623I\tM00967:43:000000000-A3JHG:1:1102:12759:12847_F3D2\t4387364\n+H\t504265\t1397\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12762:22012_F3D145\t297768\n+H\t572216\t252\t98.4\t+\t0\t0\t534I252M721I\tM00967:4
3:000000000-A3JHG:1:1102:12767:6517_F3D0\t214613\n+H\t475035\t1359\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12767:8558_F3D8\t335523\n+H\t487757\t1339\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12774:17720_F3D9\t319184\n+H\t474772\t1349\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12793:12548_F3D3\t335855\n+H\t1010527\t252\t99.6\t+\t0\t0\t494I252M606I\tM00967:43:000000000-A3JHG:1:1102:12797:25166_F3D146\t787376\n+H\t491470\t1364\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12799:6553_F3D2\t314425\n+H\t412546\t1391\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12799:20059_F3D5\t407617\n+H\t185260\t1350\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12801:3041_F3D6\t793544\n+H\t186862\t1359\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12800:16630_F3D149\t789537\n+H\t532742\t1365\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12807:25647_F3D2\t263472\n+H\t532742\t1365\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12814:9711_F3D2\t263472\n+H\t504265\t1397\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12815:11274_F3D0\t297768\n+H\t480352\t252\t99.6\t+\t0\t0\t493I252M621I\tM00967:43:000000000-A3JHG:1:1102:12825:19212_F3D2\t328698\n+H\t557013\t1380\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12830:26833_F3D143\t231806\n+H\t480352\t1366\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12832:9075_F3D8\t328698\n+H\t252433\t253\t99.2\t+\t0\t0\t499I253M598I\tM00967:43:000000000-A3JHG:1:1102:12836:12692_F3D1\t624351\n+H\t557013\t1380\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12852:10459_F3D149\t231806\n+H\t475035\t1359\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12853:26239_F3D148\t335523\n+H\t369344\t1363\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12854:9143_F3D143\t453937\n+H\t185044\t1370\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12863:27276_F3D146\t794112\n+H\t184332\t1351\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12876:8675_F3D5\t795
920\n+H\t369344\t1363\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12880:17276_F3D146\t453937\n+H\t577593\t252\t99.6\t+\t0\t0\t527I252M720I\tM00967:43:000000000-A3JHG:1:1102:12880:24745_F3D0\t208402\n+H\t504265\t1397\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12900:6040_F3D7\t297768\n+H\t530845\t253\t99.6\t+\t0\t0\t527I253M592I\tM00967:43:000000000-A3JHG:1:1102:12892:5956_F3D2\t265712\n+H\t504265\t1397\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12906:21436_F3D2\t297768\n+H\t586844\t253\t99.2\t+\t0\t0\t526I253M624I\tM00967:43:000000000-A3JHG:1:1102:12901:7843_F3D5\t194662\n+H\t480352\t1366\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12907:15280_F3D7\t328698\n+H\t185016\t1365\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12911:16413_F3D0\t794170\n+H\t557013\t1380\t100.0\t+\t0\t0\t=\tM00967:43:000000000-A3JHG:1:1102:12919:17773_F3D149\t231806\n' |
diff -r 000000000000 -r e85e7ba38aff test-data/uc_output.txt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test-data/uc_output.txt Mon Sep 14 04:52:36 2020 +0000 |
b'@@ -0,0 +1,524 @@\n+OTUId\tF3D0\tF3D147\tF3D6\tF3D7\tF3D1\tF3D144\tF3D148\tF3D149\tF3D2\tF3D141\tF3D146\tF3D9\tF3D142\tF3D145\tF3D8\tF3D3\tF3D143\tF3D5\n+794170\t6\t13\t3\t1\t1\t5\t12\t6\t3\t5\t1\t0\t1\t4\t4\t0\t2\t2\n+297768\t18\t55\t33\t17\t18\t14\t29\t34\t56\t14\t10\t20\t8\t17\t12\t15\t6\t10\n+182733\t0\t0\t1\t0\t0\t0\t0\t2\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\n+345662\t3\t31\t4\t7\t5\t3\t12\t18\t19\t5\t2\t10\t3\t8\t3\t13\t6\t4\n+328698\t14\t32\t16\t23\t7\t9\t26\t26\t46\t14\t11\t12\t6\t11\t16\t24\t12\t10\n+789537\t0\t19\t0\t0\t0\t2\t13\t6\t4\t1\t2\t1\t0\t5\t1\t0\t1\t2\n+835517\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+768418\t1\t2\t1\t1\t0\t1\t0\t5\t4\t2\t0\t3\t1\t1\t1\t1\t1\t2\n+271934\t5\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t2\t0\t0\t0\t1\t0\n+263472\t14\t48\t35\t21\t16\t12\t31\t33\t118\t15\t20\t17\t4\t20\t11\t33\t5\t14\n+715297\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+335523\t3\t20\t24\t21\t8\t3\t24\t14\t16\t4\t10\t22\t7\t23\t29\t16\t6\t5\n+802273\t0\t0\t0\t1\t3\t1\t0\t1\t5\t0\t0\t1\t0\t0\t2\t1\t0\t1\n+201490\t5\t10\t0\t0\t0\t4\t13\t8\t1\t8\t3\t0\t3\t1\t0\t0\t1\t2\n+231806\t22\t42\t0\t1\t3\t11\t24\t21\t5\t15\t6\t0\t9\t17\t0\t1\t13\t0\n+1051764\t9\t10\t12\t15\t3\t7\t6\t8\t20\t5\t2\t4\t2\t8\t7\t20\t2\t8\n+641881\t1\t0\t1\t0\t2\t0\t0\t0\t6\t0\t0\t4\t0\t0\t3\t0\t0\t0\n+764495\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+611419\t1\t1\t1\t1\t4\t1\t2\t0\t13\t0\t0\t3\t0\t0\t1\t0\t0\t2\n+794112\t8\t40\t14\t14\t3\t10\t27\t33\t14\t13\t17\t7\t2\t20\t7\t9\t7\t7\n+608864\t1\t1\t0\t1\t2\t0\t0\t0\t1\t0\t0\t4\t1\t0\t1\t0\t0\t1\n+793598\t0\t1\t1\t0\t0\t1\t0\t3\t3\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+229622\t6\t6\t12\t7\t8\t2\t17\t21\t47\t15\t2\t17\t8\t4\t7\t19\t8\t9\n+768762\t1\t0\t1\t0\t0\t0\t2\t2\t1\t3\t0\t1\t0\t1\t1\t0\t1\t1\n+752203\t0\t0\t0\t0\t1\t0\t0\t1\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\n+803937\t0\t0\t1\t1\t4\t0\t2\t2\t1\t0\t0\t3\t0\t0\t0\t0\t0\t0\n+194043\t0\t3\t0\t0\t0\t1\t3\t1\t0\t0\t0\t0\t0\t3\t0\t0\t1\t0\n+636535\t2\t2\t1\t1\t10\t1\t1\t0\t10\t1\t
0\t3\t0\t2\t2\t1\t2\t3\n+783396\t12\t23\t10\t6\t1\t6\t18\t15\t11\t7\t4\t5\t4\t15\t3\t6\t7\t4\n+315669\t0\t1\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+795920\t1\t1\t3\t3\t5\t0\t0\t0\t8\t0\t1\t2\t2\t0\t6\t0\t0\t3\n+565140\t1\t17\t2\t0\t2\t9\t15\t12\t2\t6\t10\t0\t4\t11\t2\t6\t0\t3\n+2096434\t1\t3\t0\t0\t0\t0\t1\t1\t0\t3\t0\t0\t2\t0\t0\t0\t1\t0\n+732155\t2\t0\t1\t2\t0\t0\t1\t1\t1\t0\t0\t0\t0\t0\t1\t0\t0\t0\n+216524\t1\t1\t0\t0\t0\t0\t1\t2\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\n+781109\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\n+790211\t1\t0\t1\t2\t0\t0\t0\t0\t5\t0\t0\t1\t0\t0\t0\t0\t0\t1\n+453937\t12\t26\t0\t0\t2\t3\t21\t11\t3\t6\t4\t0\t4\t10\t0\t0\t2\t0\n+214906\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\n+273631\t4\t0\t2\t0\t3\t1\t0\t1\t4\t2\t0\t0\t0\t0\t2\t0\t2\t4\n+795350\t1\t0\t1\t0\t0\t0\t0\t2\t1\t0\t0\t0\t0\t0\t0\t0\t1\t0\n+208402\t2\t1\t1\t1\t0\t1\t0\t1\t1\t2\t0\t1\t1\t1\t0\t0\t0\t0\n+367519\t1\t0\t0\t0\t0\t0\t0\t0\t2\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+796269\t0\t0\t0\t0\t0\t0\t0\t1\t1\t0\t0\t0\t0\t0\t0\t0\t0\t1\n+724472\t0\t1\t0\t1\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+4406374\t0\t0\t0\t0\t0\t0\t1\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+190916\t5\t27\t0\t5\t0\t4\t20\t6\t7\t11\t2\t0\t5\t11\t1\t3\t2\t5\n+4444771\t1\t1\t5\t0\t3\t1\t0\t0\t17\t0\t0\t0\t0\t0\t0\t15\t0\t0\n+744185\t0\t3\t2\t0\t0\t0\t1\t0\t4\t1\t1\t1\t0\t0\t0\t0\t0\t1\n+793544\t2\t4\t3\t0\t0\t1\t1\t0\t8\t0\t1\t2\t1\t1\t1\t0\t0\t1\n+309960\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+784569\t2\t3\t14\t3\t2\t0\t0\t1\t7\t0\t0\t5\t2\t2\t8\t2\t3\t3\n+739969\t1\t3\t0\t0\t0\t0\t6\t13\t0\t3\t4\t0\t0\t0\t0\t0\t3\t1\n+356404\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\n+1680788\t0\t0\t1\t0\t1\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+262166\t0\t2\t1\t0\t0\t1\t1\t3\t2\t0\t1\t0\t0\t0\t0\t1\t0\t0\n+726272\t0\t1\t1\t3\t2\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t1\t0\t0\n+741302\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\n+727165\t0\t0\t0\t0\t0\t0\t1\t2\t0\t1\t1\t0\t0\t0\t0\t0\t
0\t0\n+3172943\t0\t3\t0\t2\t0\t0\t2\t2\t1\t1\t1\t1\t0\t1\t1\t1\t0\t1\n+1918929\t0\t0\t1\t1\t0\t0\t0\t0\t2\t0\t0\t1\t0\t0\t0\t0\t0\t0\n+740299\t1\t0\t0\t1\t1\t0\t1\t0\t4\t0\t0\t2\t0\t0\t2\t0\t0\t1\n+778075\t3\t0\t0\t0\t2\t0\t0\t0\t2\t0\t0\t2\t0\t0\t3\t0\t0\t1\n+826302\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\n+702577\t2\t0\t3\t0\t2\t1\t0\t7\t0\t0\t1\t0\t0\t0\t0\t0\t1\t0\n+274697\t0\t3\t0\t0\t0\t2\t1\t1\t0\t0\t1\t0\t0\t0\t0\t0\t1\t0\n+741663\t1\t0\t1\t0\t3\t0\t0\t1\t1\t1\t2\t2\t0\t2\t1\t1\t1\t0\n+134615\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t1\t0\t0\n+4345528\t2\t5\t0\t1\t1\t2\t6\t4\t2\t1\t0\t1\t0\t1\t3\t4\t0\t0\n+767705\t2\t1\t0\t0\t1\t1\t3\t4\t2\t0\t2\t1\t0\t0\t1\t0\t1\t0\n+780452\t2\t0\t0\t0\t0\t1\t0\t2\t2\t0\t0\t1\t0\t0\t2\t0\t1\t4\n+134392\t0\t0\t1\t1\t6\t0\t1\t4\t10\t0\t1\t5\t1\t0\t2\t0\t0\t3\n+735932\t0\t0\t0\t0\t0\t0\t0\t2\t1\t1\t0\t0\t0\t0\t0\t0\t0\t1\n+335855\t0\t0\t1\t0\t1\t0\t0\t0\t1\t1\t0\t2\t0\t0\t1\t1\t0\t0\n+210001\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\n+771005\t0\t2\t1\t0\t1\t0\t2\t0\t0\t0\t0\t2\t1\t2\t0\t0\t1\t3\n+270165\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+199092\t0\t1\t0\t0\t0\t0\t0\t0\t1\t0\t0\t1\t0\t0\t0\t0\t0\t0\n+2916985\t0\t2\t0\t0\t0\t1\t1\t1\t0\t2\t0\t0\t4\t0\t0\t0\t0\t0\n+638030\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\n+723287\t0\t0\t1\t0\t1\t0\t0\t0\t2\t0\t0\t1\t1\t0\t2\t0\t0\t1\n+301960\t10\t0\t1\t1\t3\t0\t0\t3\t4\t2\t0\t0\t0\t0\t1\t1\t2\t5\n+684456\t0\t0\t0\t0\t1\t0\t0\t1\t1\t0\t2\t0\t0\t0\t1\t0\t0\t0\n+323609\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t1\t0\t0\n+645966\t0\t1\t0\t0\t0\t0\t1\t0\t0\t0\t1\t'..b'\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\n+316700\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\n+182061\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\n+644300\t1\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\n+282225\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\n+786991\t1\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0
\n+268927\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+346719\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+129394\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+136604\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\n+627122\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\n+215067\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+262087\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\n+677141\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+211667\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+183415\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+270497\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+326297\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+628730\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+275796\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+354964\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+252742\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\n+690422\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+4427968\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+264661\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+187078\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+265355\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\n+691711\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\n+788402\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+191656\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\n+206435\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\n+799694\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\n+344351\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\n+626784\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+238857\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+854052\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\n+990122\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\n+259937\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\
t0\t0\t0\t0\t0\t0\n+259884\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+177271\t0\t0\t0\t0\t0\t0\t1\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+271325\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\n+131743\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\n+794246\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\n+209027\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+181133\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+307487\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\n+184966\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\n+212750\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+3293891\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+449898\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+310155\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+657842\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+795839\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+726744\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+269368\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+832036\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+271418\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+320213\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+206173\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+1112425\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+187756\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\n+205116\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\n+610782\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\n+803615\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+267731\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+176118\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+693043\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\n+647976\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\n+176760\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\n+268163\t0\t0\t0\t0\t0\t1\
t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+356164\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+313274\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+800588\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+205969\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+184567\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+670050\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+209912\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+356339\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+262814\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+334946\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+197452\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+685477\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+204126\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+263374\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+4445339\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\n+267110\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+710391\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+795930\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+646449\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+214613\t1\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\n+194662\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t0\t1\n' |
diff -r 000000000000 -r e85e7ba38aff uclust2otutable.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/uclust2otutable.py Mon Sep 14 04:52:36 2020 +0000 |
b'@@ -0,0 +1,468 @@\n+import sys\r\n+import progress\r\n+import subprocess\r\n+import tempfile\r\n+import traceback\r\n+import argparse\r\n+\r\n+parser = argparse.ArgumentParser(\r\n+ description="This script converts uclust format from vsearch to tabular format"\r\n+\t)\r\n+parser.add_argument("-v","--version",action="version",version="%(prog)s 1.0")\r\n+parser.add_argument("-i","--input",dest="uclust",default=False,help="input filename in uclust format")\r\n+parser.add_argument("-o","--output",dest="otutable",default=False,help="output filename")\r\n+\r\n+\r\n+if(len(sys.argv) == 1):\r\n+\tparser.print_help(sys.stderr)\r\n+\tsys.exit()\r\n+\r\n+args = parser.parse_args()\r\n+\r\n+ucFileName = args.uclust\r\n+outFileName = args.otutable\r\n+\r\n+\r\n+# Tab-separated fields:\r\n+# 1=Type, 2=ClusterNr, 3=SeqLength or ClusterSize, 4=PctId, 5=Strand, 6=QueryStart, 7=SeedStart, 8=Alignment, 9=Label\r\n+# Record types (field 1): L=LibSeed, S=NewSeed, H=Hit, R=Reject, D=LibCluster, C=NewCluster, N=NotMatched\r\n+# For C and D types, PctId is average id with seed.\r\n+# QueryStart and SeedStart are zero-based relative to start of sequence.\r\n+# If minus strand, SeedStart is relative to reverse-complemented seed.\r\n+\r\n+MaxError = -1\r\n+\r\n+Type = \'?\'\r\n+ClusterNr = -1\r\n+Size = -1\r\n+PctId = -1.0\r\n+LocalScore = -1.0\r\n+Evalue = -1.0\r\n+Strand = \'.\'\r\n+QueryStart = -1\r\n+SeedStart = -1\r\n+Alignment = ""\r\n+QueryLabel = ""\r\n+TargetLabel = ""\r\n+FileName = "?"\r\n+Line = ""\r\n+\r\n+TRUNC_LABELS=0\r\n+\r\n+def GetSampleId(Label):\r\n+\tsep=";"\r\n+\tSampleID_temp = Label.split(sep,1)[0]\r\n+\tSampleID = SampleID_temp.split(\'_\',1)[-1]\r\n+\treturn SampleID\r\n+\r\n+def OnRec():\r\n+\tglobal OTUs, Samples, OTUTable\r\n+\tif Type != \'H\':\r\n+\t\treturn\r\n+\r\n+\tOTUId = TargetLabel\r\n+\tif OTUId not in OTUIds:\r\n+\t\tOTUIds.append(OTUId)\r\n+\t\tOTUTable[OTUId] = {}\r\n+\r\n+\tSampleId = GetSampleId(QueryLabel)\r\n+\tif SampleId not in 
SampleIds:\r\n+\t\tSampleIds.append(SampleId)\r\n+\r\n+\tN = GetSizeFromLabel(QueryLabel, 1)\r\n+\ttry:\r\n+\t\tOTUTable[OTUId][SampleId] += N\r\n+\texcept:\r\n+\t\tOTUTable[OTUId][SampleId] = N\r\n+\r\n+def Die(Msg):\r\n+\tprint >> sys.stderr\r\n+\tprint >> sys.stderr\r\n+\r\n+\ttraceback.print_stack()\r\n+\ts = ""\r\n+\tfor i in range(0, len(sys.argv)):\r\n+\t\tif i > 0:\r\n+\t\t\ts += " "\r\n+\t\ts += sys.argv[i]\r\n+\tprint >> sys.stderr, s\r\n+\tprint >> sys.stderr, "**ERROR**", Msg\r\n+\tprint >> sys.stderr\r\n+\tprint >> sys.stderr\r\n+\tsys.exit(1)\r\n+\tprint("NOTHERE!!")\r\n+\t\r\n+def Warning(Msg):\r\n+\tprint >> sys.stderr\r\n+\tprint >> sys.stderr, sys.argv\r\n+\tprint >> sys.stderr, "**WARNING**", Msg\r\n+\r\n+def isgap(c):\r\n+\treturn c == \'-\' or c == \'.\'\r\n+\r\n+def GetSeqCount(FileName):\r\n+\tTmp = tempfile.TemporaryFile()\r\n+\ttry:\r\n+\t\tTmpFile = Tmp.file\r\n+\texcept:\r\n+\t\tTmpFile = Tmp\r\n+\ts = subprocess.call([ "grep", "-c", "^>", FileName ], stdout=TmpFile)\r\n+\tTmpFile.seek(0)\r\n+\ts = TmpFile.read()\r\n+\treturn int(s)\r\n+\r\n+def GetSeqsDict(FileName):\r\n+\treturn ReadSeqsFast(FileName, False)\r\n+\r\n+def ReadSeqsDict(FileName, Progress = False):\r\n+\treturn ReadSeqsFast(FileName, Progress)\r\n+\r\n+def ReadSeqsOnSeq(FileName, OnSeq, Progress = False):\r\n+\tReadSeqs3(FileName, OnSeq, Progress)\r\n+\r\n+def ReadSeqsFastFile(File, Progress = False):\r\n+\tSeqs = {}\r\n+\tId = ""\r\n+\tN = 0\r\n+\twhile 1:\r\n+\t\tif N%10000 == 0 and Progress:\r\n+\t\t\tsys.stderr.write("%u seqs\\r" % (N))\r\n+\t\tLine = File.readline()\r\n+\t\tif len(Line) == 0:\r\n+\t\t\tif Progress:\r\n+\t\t\t\tsys.stderr.write("%u seqs\\n" % (N))\r\n+\t\t\treturn Seqs\r\n+\t\tif len(Line) == 0:\r\n+\t\t\tcontinue\r\n+\t\tLine = Line.strip()\r\n+\t\tif Line[0] == ">":\r\n+\t\t\tN += 1\r\n+\t\t\tId = Line[1:]\r\n+\t\t\tif TRUNC_LABELS:\r\n+\t\t\t\tId = Id.split()[0]\r\n+\t\t\tSeqs[Id] = ""\r\n+\t\telse:\r\n+\t\t\tif Id == "":\r\n+\t\t\t\tDie("FASTA file 
does not start with \'>\'")\r\n+\t\t\tSeqs[Id] = Seqs[Id] + Line\r\n+\r\n+def ReadSeqsFast(FileName, Progress = True):\r\n+\tFile = open(FileName)\r\n+\treturn ReadSeqsFastFile(File, Progress)\r\n+\r\n+def ReadSeqs(FileName, toupper=False, stripgaps=False, Progress=False):\r\n+\tif not toupper and not stripgaps:\r\n+\t\treturn ReadSeqsFast(FileName, False)\r\n+\r\n+\tSeqs = {}\r\n+\tId = ""\r\n+\tFile = open(FileName)\r\n+\twhile 1:\r\n+\t\tLine = File.readline()\r\n+\t\tif len(Line)'..b'abel.split(\';\')\r\n+\tfor Field in Fields:\r\n+\t\tif Field.startswith(Name + "="):\r\n+\t\t\tn = len(Name) + 1\r\n+\t\t\treturn Field[n:]\r\n+\tif Default == "":\r\n+\t\tDie("Field %s= not found in >%s" % (Name, Label))\r\n+\treturn Default\r\n+\r\n+def GetIntFieldFromLabel(Label, Name, Default):\r\n+\treturn int(GetField(Label, Name, Default))\r\n+\r\n+def GetFieldFromLabel(Label, Name, Default):\r\n+\treturn GetField(Label, Name, Default)\r\n+\r\n+def DeleteFieldFromLabel(Label, Name):\r\n+\tNewLabel = ""\r\n+\tFields = Label.split(\';\')\r\n+\tfor Field in Fields:\r\n+\t\tif len(Field) > 0 and not Field.startswith(Name + "="):\r\n+\t\t\tNewLabel += Field + \';\'\r\n+\treturn NewLabel\r\n+\r\n+def ReplaceSize(Label, Size):\r\n+\tFields = Label.split(";")\r\n+\tNewLabel = ""\r\n+\tDone = False\r\n+\tfor Field in Fields:\r\n+\t\tif Field.startswith("size="):\r\n+\t\t\tNewLabel += "size=%u;" % Size\r\n+\t\t\tDone = True\r\n+\t\telse:\r\n+\t\t\tif Field != "":\r\n+\t\t\t\tNewLabel += Field + ";"\r\n+\tif not Done:\r\n+\t\tdie.Die("size= not found in >" + Label)\r\n+\treturn NewLabel\t\r\n+\r\n+def Error(s):\r\n+\tprint >> sys.stderr, "*** ERROR ***", s, sys.argv\r\n+\tsys.exit(1)\t\r\n+\r\n+def ProgressFile(File, FileSize):\r\n+#\tif not sys.stderr.isatty():\r\n+#\treturn\r\n+\tPos = File.tell()\r\n+\tPct = (100.0*Pos)/FileSize\r\n+\tStr = "%s %5.1f%%\\r" % (FileName, Pct)\r\n+\tsys.stderr.write(Str)\r\n+\r\n+def Progress(i, N):\r\n+#\tif not 
sys.stderr.isatty():\r\n+\treturn\r\n+\tPct = (100.0*i)/N\r\n+\tStr = "%5.1f%%\\r" % Pct\r\n+\tsys.stderr.write(Str)\r\n+\r\n+def PrintLine():\r\n+\tprint(Line)\r\n+\r\n+def ParseRec(Line):\r\n+\tglobal Type\r\n+\tglobal ClusterNr\r\n+\tglobal Size\r\n+\tglobal PctId\r\n+\tglobal Strand\r\n+\tglobal QueryStart\r\n+\tglobal SeedStart\r\n+\tglobal Alignment\r\n+\tglobal QueryLabel\r\n+\tglobal TargetLabel\r\n+\tglobal LocalScore\r\n+\tglobal Evalue\r\n+\t\r\n+\tFields = Line.split("\\t")\r\n+\tN = len(Fields)\r\n+\tif N != 9 and N != 10:\r\n+\t\tError("Expected 9 or 10 fields in .uc record, got: " + Line)\r\n+\tType = Fields[0]\r\n+\t\r\n+\ttry:\r\n+\t\tClusterNr = int(Fields[1])\r\n+\texcept:\r\n+\t\tClusterNr = -1\r\n+\t\t\r\n+\ttry:\t\r\n+\t\tSize = int(Fields[2])\r\n+\texcept:\r\n+\t\tSize = -1\r\n+\r\n+\tFields2 = Fields[3].split(\'/\')\r\n+\tLocalScore = -1.0\r\n+\tEvalue = -1.0\r\n+\tif len(Fields2) == 3:\r\n+\t\ttry:\r\n+\t\t\tPctId = float(Fields2[0])\r\n+\t\t\tLocalScore = float(Fields2[1])\r\n+\t\t\tEvalue = float(Fields2[2])\r\n+\t\texcept:\r\n+\t\t\tPctId = -1.0\r\n+\telse:\r\n+\t\ttry:\r\n+\t\t\tPctId = float(Fields[3])\r\n+\t\texcept:\r\n+\t\t\tPctId = -1.0\r\n+\r\n+\tStrand = Fields[4]\r\n+\t\r\n+\ttry:\r\n+\t\tQueryStart = int(Fields[5])\r\n+\texcept:\r\n+\t\tQueryStart = -1\r\n+\r\n+\ttry:\r\n+\t\tSeedStart = int(Fields[6])\r\n+\texcept:\r\n+\t\tSeedStart = -1\r\n+\r\n+\tAlignment = Fields[7]\r\n+\tQueryLabel = Fields[8]\r\n+\t\r\n+\tif len(Fields) > 9:\r\n+\t\tTargetLabel = Fields[9]\r\n+\r\n+def GetRec(File, OnRecord):\r\n+\tglobal Line\r\n+\twhile 1:\r\n+\t\tLine = File.readline()\r\n+\t\tif len(Line) == 0:\r\n+\t\t\treturn 0\r\n+\t\tif Line[0] == \'#\':\r\n+\t\t\tcontinue\r\n+\t\tLine = Line.strip()\r\n+\t\tif len(Line) == 0:\r\n+\t\t\treturn 1\r\n+\t\tParseRec(Line)\r\n+\t\tOk = OnRecord()\r\n+\t\tif Ok != None and Ok == 0:\r\n+\t\t\treturn 0\r\n+\t\treturn 1\r\n+\r\n+def ReadRecs(argFileName, OnRecord, ShowProgress = False):\r\n+\treturn 
ReadFile(argFileName, OnRecord, ShowProgress)\r\n+\r\n+def ReadRecsOnRec(argFileName, OnRecord, ShowProgress = True):\r\n+\treturn ReadFile(argFileName, OnRecord, ShowProgress)\r\n+\r\n+def GetRecs(argFileName, OnRecord, ShowProgress = True):\r\n+\treturn ReadFile(argFileName, OnRecord, ShowProgress)\r\n+\r\n+def ReadFile(argFileName, OnRecord, ShowProgress = True):\r\n+\tglobal FileName\r\n+\tFileName = argFileName\r\n+\tFile = open(FileName)\r\n+\r\n+\tif ShowProgress:\r\n+\t\tprogress.InitFile(File, FileName)\r\n+\twhile GetRec(File, OnRecord):\r\n+\t\tif ShowProgress:\r\n+\t\t\tprogress.File()\r\n+\tif ShowProgress:\r\n+\t\tprogress.FileDone()\r\n+\r\n+OTUIds = []\r\n+SampleIds = []\r\n+OTUTable = {}\r\n+\r\n+ReadRecs(ucFileName, OnRec)\r\n+\r\n+fout=open(outFileName,\'w\')\r\n+\r\n+s = "OTUId"\r\n+for SampleId in SampleIds:\r\n+\ts += "\\t" + SampleId\r\n+\r\n+fout.write("%s\\n" % s)\r\n+\r\n+for OTUId in OTUIds:\r\n+\ts = OTUId\r\n+\tfor SampleId in SampleIds:\r\n+\t\ttry:\r\n+\t\t\tn = OTUTable[OTUId][SampleId]\r\n+\t\texcept:\r\n+\t\t\tn = 0\r\n+\t\ts += "\\t" + str(n)\r\n+\tfout.write("%s\\n" % s)\r\n+\r\n+fout.close()\r\n' |
diff -r 000000000000 -r e85e7ba38aff uclust2otutable.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/uclust2otutable.xml Mon Sep 14 04:52:36 2020 +0000 |
@@ -0,0 +1,98 @@ +<tool id="uclust2otutable" name="OTUTable" version="1.0.0"> + <description>Convert UCLUST format from Vsearch to OTU Table</description> + <version_command> + python ${__tool_directory__}/uclust2otutable.py --version + </version_command> + <command detect_errors="aggressive"> + python ${__tool_directory__}/uclust2otutable.py + -i '$inputfile' + -o '$output' + </command> + <inputs> + <param format="tabular" name="inputfile" type="data" label="UCLUST from Vsearch" /> + </inputs> + <outputs> + <data format="tabular" name="output" label="OTU_TABLE_${inputfile.display_name}"/> + </outputs> + <tests> + <test> + <param name="inputfile" value="uc_input.txt"/> + <output name="output" file="uc_output.txt"/> + </test> + </tests> + + <help> +** what it does ** + +Converts UCLUST format (.uc) output from Vsearch search into raw count table. The description of UCLUST format is based on the information that can be found on UCLUST_ documentation page. + +.. _UCLUST: http://www.drive5.com/uclust/uclust_userguide_1_1_579.html + +-------- + +======= +Example +======= + +Some example records: +--------------------- + +==== ======= ==== ==== ====== === === ========= ========== ====== +Type Cluster Size %Id Strand Qlo Tlo Alignment Query Target +---- ------- ---- ---- ------ --- --- --------- ---------- ------ + S 0 292 '*' '*' '*' '*' '*' AH70_12410 '*' + H 0 292 99.7 '+' 0 0 292M AH70_12410 '*' + S 0 292 '*' '*' '*' '*' '*' AH70_12410 '*' + H 0 292 98.2 '+' 0 0 292M AH70_12410 '*' +==== ======= ==== ==== ====== === === ========= ========== ====== + +Each record has ten fields, separated by tabs: +---------------------------------------------- + +========= =========================================== +Column Description +--------- ------------------------------------------- +Type Record type +Cluster Cluster number +Size Sequence length or cluster size +%Id Identity to the seed(as a percentage), or * if this is a seed. +Strand '+' plus strand, '-' minus strand, or '.' 
amino acids. +Qlo 0-based coordinate of alignment start in the query sequence. +Tlo 0-based coordinate of alignment start in target (seed) sequence. If minus strand, Tlo is relative to start of reverse-complement target. +Alignment Compressed representation of alignment to the seed(see below), or '*' if a seed. +Query FASTA label of query sequence +Target FASTA label of target(seed / library / database) sequence. or '*' if a seed. +========= =========================================== + +Record Types are: +----------------- + +====== =========================================== +Column Description +------ ------------------------------------------- +L Library seed(generated only if a match if found to this seed). +S New seed. +H Hit, also known as an accept; i.e. a successful match. +D Library cluster. +C New cluster. +N Not matched (a sequence that didn't match library with --libonly specified). +R Reject (generated only if --output_rejects is specified) +====== =========================================== + +The alignment is compressed using run-length encoding, as follows. Each column in the alignment is classified as M,D or I: +-------------------------------------------------------------------------------------------------------------------------- + +==== ====== ============== ============= +Code Name Query sequence Seed sequence +---- ------ -------------- ------------- +M Match Letter Letter +D Delete Gap Letter +I Insert Letter Gap +==== ====== ============== ============= + +-------- + + + </help> + +</tool> |
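The conversion that the help text describes (hit records in, raw count table out) can be sketched in Python 3 roughly as follows. This is a minimal illustration under stated assumptions, not the committed uclust2otutable.py. The function names `sample_id`, `uc_to_otu_table`, and `parse_alignment` are hypothetical, and every H record is assumed to count as one read. The rule that a run length of 1 is written without a number is an assumption based on the UCLUST convention, not something shown in this changeset.

```python
import re


def sample_id(label):
    """Return the sample name: the text after the first '_' in the
    part of the query label that precedes any ';' annotation."""
    return label.split(";", 1)[0].split("_", 1)[-1]


def uc_to_otu_table(lines):
    """Build a count table (list of rows) from .uc records.

    Only H (hit) records with all ten fields are counted. Column 9 is
    the query label and column 10 is the target (OTU) label.
    Assumption: each hit contributes a count of 1.
    """
    otu_ids, sample_ids, counts = [], [], {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 10 or fields[0] != "H":
            continue
        otu, sample = fields[9], sample_id(fields[8])
        if otu not in counts:
            counts[otu] = {}
            otu_ids.append(otu)
        if sample not in sample_ids:
            sample_ids.append(sample)
        counts[otu][sample] = counts[otu].get(sample, 0) + 1

    # Header row first, then one row per OTU in order of first appearance.
    rows = [["OTUId"] + sample_ids]
    for otu in otu_ids:
        rows.append([otu] + [str(counts[otu].get(s, 0)) for s in sample_ids])
    return rows


def parse_alignment(code):
    """Split a run-length encoded alignment such as '531I252M589I'
    into (count, letter) pairs. Assumption: a missing count means 1.
    Codes with no M/D/I runs (for example '=' or '*') give an empty list."""
    return [(int(n or 1), op) for n, op in re.findall(r"(\d*)([MDI])", code)]


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python sketch.py input.uc > table.tsv
    with open(sys.argv[1]) as handle:
        for row in uc_to_otu_table(handle):
            print("\t".join(row))
```

Samples and OTUs are kept in order of first appearance, which matches the column and row order seen in test-data/uc_output.txt (F3D0, F3D147, F3D6, ... and 794170, 297768, ...).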