Is there a grep that is better, faster, and more powerful than grep itself?

Published: 2008-06-10 10:21

Unless you have seen this tool, read its introduction, and actually tried it, this is really hard to believe.

The tool is called ack, and its website bills it as: "better than grep, a search tool for programmers"

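A Gentoo developer described it as "an exciting new tool that can replace grep in some situations", and he also commented, one by one, on the "top ten reasons" listed on the ack site.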

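Even if it is faster, though, this tool cannot fully replace grep, because it needs the Perl interpreter and Perl's standard modules to run.
And since it is interpreted by Perl, how can it be faster than grep, which is written in C? That's really a problem.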

The same Gentoo developer is also writing a series of daily ack tips; the link is: ack daily tips



Top ten reasons it wins:





  1. It's blazingly fast because it only searches the stuff you want
    searched.


    Wait, how does it know what I want? A DWIM interface
    (http://en.wikipedia.org/wiki/DWIM) at last? Not
    quite. First off, ack is faster than grep
    for simple searches. Here's an example:



    $ time ack 1Jsztn-000647-SL exim_main.log >/dev/null
    real 0m3.463s
    user 0m3.280s
    sys 0m0.180s
    $ time grep -F 1Jsztn-000647-SL exim_main.log >/dev/null
    real 0m14.957s
    user 0m14.770s
    sys 0m0.160s

    Two notes: first, yes, the file was in the page cache before I
    ran ack; second, I even made it easy for grep by telling
    it explicitly I was looking for a fixed string (not that it helped
    much, the same command without -F was faster by about
    0.1s). Oh, and for completeness: the exim logfile I searched has
    about two million lines and is 250M. I ran each test ten times;
    the times shown above are typical.


    So yes, for simple searches, ack is faster than grep.
    Let's try with a more complicated pattern, then. This time, let's
    use the pattern (klausman|gentoo) on the same file. Note
    that we have to use -E for grep to use extended
    regexen, which ack in turn does not need, since it
    (almost) always uses them. Here, grep takes its sweet
    time: 3:56, nearly four minutes. In contrast, ack
    accomplished the same task in 49 seconds (all times averaged over
    ten runs, then rounded to integer seconds).


    As for the "being clever" side of speed, see points 5 and 6
    below.




  2. ack is pure Perl, so it runs on Windows just fine.


    This isn't relevant to me, since I don't use Windows for
    anything where I might need grep. That said, it might be a killer
    feature for others.




  3. The standalone version uses no non-standard modules, so you can
    put it in your ~/bin without fear.


    Ok, this is not so much a feature as a hard criterion. If I
    needed extra modules for the whole thing to run, that'd be a deal
    breaker. I already have tons of libraries, I don't need more
    undergrowth around my dependency tree.




  4. Searches recursively through directories by default, while
    ignoring .svn, CVS and other VCS directories.


    This is a feature, yet one that wouldn't pry me away from grep:
    -r is there (though it distinctly feels like an
    afterthought). Since ack ignores a certain set of files
    and directories, its recursive capabilities were there from the
    start, making it feel more seamless.




  5. ack ignores most of the crap you don't want to search


    To be precise:




    • VCS directories

    • blib, the Perl build directory

    • backup files like foo~ and #foo#

    • binary files, core dumps, etc.



    Most of the time, I don't want to search those (and have to
    exclude them with grep -v from find results). Of
    course, this ignore-mode can be switched off with ack
    (-u). All that said, it sure makes command lines shorter
    (and easier to read and construct). Also, this is the first spot
    where ack's Perl-centricism shows. I don't mind, even though I
    prefer that other language with P.
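For comparison, here is a rough sketch of what approximating that ignore list takes with plain find and grep. The helper names list_searchable_files and search_tree are hypothetical, .git is added here as one example of the "other VCS directories", and grep's -I flag stands in for the binary-file check. This is not how ack itself is implemented.

```shell
# Hypothetical helper: list the files worth searching under a directory,
# pruning VCS directories and blib and skipping backup files (foo~, #foo#).
list_searchable_files() {
    find "${1:-.}" \
        \( -type d \( -name .svn -o -name CVS -o -name .git -o -name blib \) -prune \) \
        -o \( -type f ! -name '*~' ! -name '#*#' -print \)
}

# Hypothetical helper: search those files for a fixed string.
# -I skips binary files, -H and -n print "file:line:" before each match.
# Assumption: file names contain no newlines.
search_tree() {
    pattern=$1
    dir=${2:-.}
    list_searchable_files "$dir" | while IFS= read -r file; do
        grep -I -n -H -F -- "$pattern" "$file"
    done
}
```

Something like `search_tree needle ~/src` would then print matches only from files outside the pruned directories, which is roughly what ack does without any of this scaffolding.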




  6. Ignoring .svn directories means that ack is faster than grep
    for searching through trees.


    Dupe. See point 5.




  7. Lets you specify file types to search, as in --perl or
    --nohtml.


    While at first glance, this may seem limited, ack comes
    with a plethora of definitions (45 if I counted correctly), so it's
    not as perl-centric as it may seem from the example. This feature
    saves command-line space (if there's such a thing), since it avoids
    wild find-constructs. The docs mention that --perl also
    checks the shebang line of files that don't have a suffix, but make
    no mention of the other "shipped" file type recognizers doing
    so.
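To make the shebang check concrete, here is a minimal sketch of that kind of file-type test in shell. The function name is_perl_file is hypothetical, and the suffix list (.pl, .pm, .pod, .t) is an assumption about what --perl covers rather than something taken from the ack docs quoted here.

```shell
# Hypothetical helper: succeed if a file looks like Perl, either by its
# suffix or, for files without a suffix, by its shebang line.
is_perl_file() {
    case "$1" in
        *.pl|*.pm|*.pod|*.t) return 0 ;;   # assumed Perl suffixes
    esac
    case "${1##*/}" in
        *.*) return 1 ;;                   # some other suffix: not Perl
    esac
    # No suffix: look for "perl" on the first line after "#!".
    head -n 1 "$1" 2>/dev/null | grep -q '^#!.*perl'
}
```

A file named `script` that starts with `#!/usr/bin/perl` passes the check, while `page.html` or a suffix-less shell script does not.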




  8. File-filtering capabilities usable without searching with ack
    -f. This lets you create lists of files of a given type.


    This mostly is a consequence of the feature above. Even if it
    weren't there, you could simply search for "."




  9. Color highlighting of search results.


    While I've looked upon color in shells as kinda childish for a
    while, I wouldn't want to miss syntax highlighting in vim, colors
    for ls (if they're not as sucky as the defaults we had for years)
    or match highlighting for grep. It's really neat to see that yes,
    the pattern you grepped for indeed matches what you think it does.
    Especially during evolutionary construction of command lines and
    shell scripts.




  10. Uses real Perl regular expressions, not a GNU subset


    Again, this doesn't bother me much. I use
    egrep/grep -E all the time, anyway. And I'm no
    Perl programmer, so I don't get withdrawal symptoms every time I
    use another regex engine.




  11. Allows you to specify output using Perl's special
    variables


    This sounds neat, yet I don't really have a use case for
    it. Also, my perl-fu is weak, so I probably won't use it anyway.
    Still, might be a killer feature for you.


    The docs have an example:


    ack '(Mr|Mr?s)\. (Smith|Jones)' --output='$&'
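In Perl, $& holds the text matched by the last successful pattern match, so that example prints the matched name rather than the whole line. As a rough point of comparison, grep's -o flag gives a similar result; the wrapper name print_matches below is hypothetical, and whether ack prints one result per line or per match is not checked here (grep -o prints every match).

```shell
# Hypothetical wrapper: print only the matched text, one match per line.
# -E is needed for the alternation and the optional "r"; -o prints only
# the matching part of each line.
print_matches() {
    grep -o -E '(Mr|Mr?s)\. (Smith|Jones)' "$@"
}

# Usage: prints "Mrs. Smith" and then "Mr. Jones".
printf 'Dear Mrs. Smith\nnothing to see here\nMr. Jones called\n' | print_matches
```

The other special variables that --output accepts have no such direct grep counterpart, which is where the feature goes beyond what -o can do.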



  12. Many command-line switches are the same as in GNU grep:


    Specifically mentioned are -w, -c and
    -l. It's always nice if you don't have to look up all the
    flags every time.




  13. Command name is 25% fewer characters to type! Save days of
    free-time! Heck, it's 50% shorter compared to grep -r


    Okay, now we have proof that not only can the ack webmaster
    not count, he's also making up reasons for fun. Works for me.





Original: http://qtchina.tk/?q=node/182
