Optimizing Python code for text format conversion

Date: 2017-07-16 18:03 | Source: unknown | Author: admin

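The format conversion code is given below. It runs very slowly, so I would like to hear whether anyone has suggestions for optimizing it. The input file looks like this: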


Number of segment pairs = 182; number of pairwise comparisons = 40
'+' means given segment; '-' means reverse complement

Overlaps            Containments  No. of Constraints Supporting Overlap

******************* SCL279Contig1 ********************
whvs7e09.R+
24990481+
******************* SCL279Contig2 ********************
et|RFL_Contig5917+
                    24993123+ is in et|RFL_Contig5917+
                    whsctal27f01.R+ is in et|RFL_Contig5917+
                    whxn27054l15.R+ is in et|RFL_Contig5917+
                    whsctal3n06.F- is in et|RFL_Contig5917+
                    whthkles18l09.R+ is in et|RFL_Contig5917+
                    whsctal3n06.R+ is in et|RFL_Contig5917+
                    32771503+ is in whsctal3n06.R+
                    32678311+ is in et|RFL_Contig5917+
whxn27054l15.F-

DETAILED DISPLAY OF CONTIGS
******************* SCL279Contig1 ********************
                          .    :    .    :    .    :    .    :    .    :    .    :
whvs7e09.R+           CCGCAGGGGTCGCGGCCGAGCAGGGCGGCGCCGGCGACGAGCTCTCCGCTCTGTTCAAGG
                      ____________________________________________________________
consensus             CCGCAGGGGTCGCGGCCGAGCAGGGCGGCGCCGGCGACGAGCTCTCCGCTCTGTTCAAGG

                          .    :    .    :    .    :    .    :    .    :    .    :
whvs7e09.R+           AGTTCTCGCTAGACAGCAGCAGCACCTTCGCCGAGGCGCGGATCCGGGCCACCTTCTACC
                      ____________________________________________________________
consensus             AGTTCTCGCTAGACAGCAGCAGCACCTTCGCCGAGGCGCGGATCCGGGCCACCTTCTACC

                          .    :    .    :    .    :    .    :    .    :    .    :
whvs7e09.R+           CCAAGTTCGAGAACGAGGAATCCGACCAGGAGTCAAGAACCCGGATGATTGAGATGGTGT
                      ____________________________________________________________
consensus             CCAAGTTCGAGAACGAGGAATCCGACCAGGAGTCAAGAACCCGGATGATTGAGATGGTGT

                          .    :    .    :    .    :    .    :    .    :    .    :
whvs7e09.R+           CACAAGGATTAGCTACCATGGAGGTTACGCTCAAGCATTCAGGGTCTTTGTTCATGTATG
24990481+                                                               TTCATGTATG
                      ____________________________________________________________
consensus             CACAAGGATTAGCTACCATGGAGGTTACGCTCAAGCATTCAGGGTCTTTGTTCATGTATG

                          .    :    .    :    .    :    .    :    .    :    .    :
whvs7e09.R+           CTGGTAACCGTGGTGGTGCATATGCCAAGAACAGCTTTGGAAATATCTACACTGCTGTGG
24990481+             CTGGTAACCGTGGTGGTGCATATGCCAAGAACAGCTTTGGAAATATCTACACTGCTGTGG
                      ____________________________________________________________
consensus             CTGGTAACCGTGGTGGTGCATATGCCAAGAACAGCTTTGGAAATATCTACACTGCTGTGG

                          .    :    .    :    .    :    .    :    .    :    .    :
whvs7e09.R+           GCGTTTTTGTTTTGGGTCGCTTGTTTCGTGAAGCTTGGGGGAGAGAAGCTCCTAAAATGC
24990481+             GCGTTTTTGTTTTGGGTCGCTTGTTTCGTGAAGCTTGGGGGAGAGAAGCTCCTAAAATGC
                      ____________________________________________________________
consensus             GCGTTTTTGTTTTGGGTCGCTTGTTTCGTGAAGCTTGGGGGAGAGAAGCTCCTAAAATGC

                          .    :    .    :    .    :    .    :    .    :    .    :
whvs7e09.R+           AAGCGGAATTCAATGATTGTCTCGAGAAAAACCGAATAAGCATTTCAATGGAACTTGTCA
24990481+             AAGCGGAATTCAATGATTGTCTCGAGAAAAACCGAATAAGCATTTCAATGGAACTTGTCA
                      ____________________________________________________________
consensus             AAGCGGAATTCAATGATTGTCTCGAGAAAAACCGAATAAGCATTTCAATGGAACTTGTCA

                          .    :    .    :    .    :    .    :    .    :    .    :
whvs7e09.R+           CGGCTGTATTAGGAGACCATGGGCAAAGGCCTAAGGATGATTATGCTGTGATTACAGCTG
24990481+             CGGCTGTATTAGGAGACCATGGGCAAAGGCCTAAGGATGATTATGCTGTGATTACAGCTG
                      ____________________________________________________________
consensus             CGGCTGTATTAGGAGACCATGGGCAAAGGCCTAAGGATGATTATGCTGTGATTACAGCTG


Number of segment pairs = 600; number of pairwise comparisons = 172
'+' means given segment; '-' means reverse complement

Overlaps            Containments  No. of Constraints Supporting Overlap

******************* SCL292Contig1 ********************
whsctal16b21.F+
                    9428019- is in whsctal16b21.F+
                    whxq28060f16.F+ is in whsctal16b21.F+
                    whxq29060f16.F+ is in whxq28060f16.F+
whxn14056i15.F+
whxq28060f16.R-
                    9362629- is in whxq28060f16.R-
whxq29060f16.R-
******************* SCL292Contig2 ********************
et|tplb0013k02+
                    whthls7l03.R+ is in et|tplb0013k02+
                    9561217+ is in et|tplb0013k02+
                    9561363+ is in et|tplb0013k02+
                    93033944+ is in 9561363+
                    14317289+ is in et|tplb0013k02+
                    32659187+ is in et|tplb0013k02+
                    32663705+ is in et|tplb0013k02+
                    93191970+ is in et|tplb0013k02+
                    32786704+ is in et|tplb0013k02+
                    whsctal16b21.R+ is in 32786704+
                    33217630+ is in et|tplb0013k02+
                    whxn14056i15.R+ is in 33217630+
                    55676375+ is in et|tplb0013k02+
                    93032669- is in et|tplb0013k02+
                    whv16n6d15.F- is in 93032669-
                    whv16n6d15.R+ is in 93032669-

DETAILED DISPLAY OF CONTIGS
******************* SCL292Contig1 ********************
                          .    :    .    :    .    :    .    :    .    :    .    :
whsctal16b21.F+       GGCATACTATAGCATCATTGTGGTCTGGAAACATTGGAGGGCTATAATGAAAAAAAATAC
                      ____________________________________________________________
consensus             GGCATACTATAGCATCATTGTGGTCTGGAAACATTGGAGGGCTATAATGAAAAAAAATAC

                          .    :    .    :    .    :    .    :    .    :    .    :
whsctal16b21.F+       TAAATTGAGTTGAAGTCCAAGGAATTAGTGCCATACAACAACTGAAACTTTCTGGTGCTA
9428019-                                        AGTGCCATACAACAACTGAAACTTTCTGGTGCTA
whxq28060f16.F+                 TCAAGTCCAAGGAATTAGTGCCATACAACAACTGAAACTTTCTGGTGCTA
whxq29060f16.F+                 TCAAGTCCAAGGAATTAGTGCCATACAACAACTGAAACTTTCTGGTGCTA
whxn14056i15.F+                                     CCATACAACAACTGAAACTTTCTGGTGCTA
                      ____________________________________________________________
consensus             TAAATTGAGTTCAAGTCCAAGGAATTAGTGCCATACAACAACTGAAACTTTCTGGTGCTA

                          .    :    .    :    .    :    .    :    .    :    .    :
whsctal16b21.F+       CTTACACCTGGGTCAGGCTCTTGCAGAGCTGGAGCAAATTTGTAGCTCAGCGTTGCAATG
9428019-              CTTACACCTGGGTCAGGCTCTTGCAGAGCTGGAGCAAATTTGTAGCTCAGCGTTGCAATG
whxq28060f16.F+       CTTACACCTGGATCAGGCTCGTGCTGAGCTGGAGCAAATTTGTAGCTCAGCATTGCAATG
whxq29060f16.F+       CTTACACCTGGATCAGGCTCGTGCTGAGCTGGAGCAAATTTGTAGCTCAGCATTGCAATG
whxn14056i15.F+       CTTACACCTGGATCAGGCTCGTGCTGAGCTGGAGCAAATTTGTAGCTCAGCATTGCAATG
whxq28060f16.R-                  ATCAGGCTCGTGCTGAGCTGGAGCAAATTTGTAGCTCAGCATTGCAATG
whxq29060f16.R-                  ATCAGGCTCGTGCTGAGCTGGAGCAAATTTGTAGCTCAGCATTGCAATG
                      ____________________________________________________________
consensus             CTTACACCTGGATCAGGCTCGTGCTGAGCTGGAGCAAATTTGTAGCTCAGCATTGCAATG

The desired result is:


SCL279Contig1 whvs7e09.R 24990481
SCL279Contig2 et|RFL_Contig5917 24993123 whsctal27f01.R whxn27054l15.R whsctal3n06.F whthkles18l09.R whsctal3n06.R 32771503 32678311 whxn27054l15.F
SCL292Contig1 whsctal16b21.F 9428019 whxq28060f16.F whxq29060f16.F whxn14056i15.F whxq28060f16.R 9362629 whxq29060f16.R
SCL292Contig2 et|tplb0013k02 whthls7l03.R 9561217 9561363 93033944 14317289 32659187 32663705 93191970 32786704 whsctal16b21.R 33217630 whxn14056i15.R 55676375 93032669 whv16n6d15.F whv16n6d15.R

My code:


# encoding: utf-8
# Python 2 script

with open('1.txt', 'r') as f:
    a = []  # line numbers of the 'Overlaps' header lines
    b = []  # line numbers of the 'DETAILED' header lines
    for num, line in enumerate(f):
        if 'Overlaps' in line:
            a.append(num)
        if 'DETAILED' in line:
            b.append(num)
    f.seek(0, 0)
    for i, j in zip(a, b):
        lines = f.readlines()[i:j + 1]  # read the specified range of lines (re-reads the whole file each time)
        if '**' in ''.join(lines):
            for i in range(len(lines)):  # convert the format of the selected lines
                new = lines[i].strip().split()
                if '*******************' in new:
                    print '\n' + new[1],
                elif len(new) == 1:
                    print new[0][:-1],
                elif 'in' in new:
                    print new[0][:-1],
        f.seek(0, 0)

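There are roughly 100,000 result records, and the code above had been running for 6 hours without finishing. Does anyone have a good idea for optimizing it? Thanks for any pointers.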
