delete large range of lines in kate text file


    Hello,
    I have an XML file containing some 40,000 lines. Of these I want to delete a LARGE range, and I know the first and last line numbers of that range. Is there a way to say, e.g., "delete fromLine# toLine#"?
    Thanks for any suggestions
    H. Stoellinger

    #2
    Many tools do this. Maybe the simplest would be sed:
    Code:
    sed '/beginning-pattern/,/end-pattern/d'  < input > output
    But watch out: sed and the like are line-oriented, and sometimes XML is not. The file may look like one very long line, or have very long lines with your patterns somewhere in the middle, in which case deleting whole lines removes more than you intended.
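
When the first and last line numbers are already known, as in the original question, sed also accepts numeric addresses instead of patterns. A minimal sketch, assuming a small generated demo file (the names demo_input.txt and demo_output.txt are made up for illustration) in place of the real XML file:

```shell
# Build a small demo file (10 numbered lines) so the sketch runs anywhere.
seq 1 10 > demo_input.txt

# Delete lines 3 through 8 inclusive; sed line addresses are 1-based.
sed '3,8d' demo_input.txt > demo_output.txt

# Show what is left: lines 1, 2, 9 and 10.
cat demo_output.txt
```

For the real file, substitute the actual line numbers and file names. Writing to a separate output file, as above, keeps the original intact in case the range turns out to be wrong.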

    Me, I'd just edit the file with vim. Editors can cope with GB-sized files these days, if a little slowly. But kate... it can do it if you know how to use JavaScript with kate, but I don't. IMO kate should be able to extend the selection while moving to a bookmark, with Shift-Alt-PgDn.
    Regards, John Little


      #3
      deleting many lines in large files in kate

      Thanks for the prompt answer. I agree that kate should be able to do what you suggest. I am probably one of the few Linux guys who don't use vim (or emacs, for that matter!).
      I have used nano, but my "quick" editor, called from bash, is micro. However, there isn't a "del linefrom lineto" or "del fromMark toMark" in micro either. Of my file of more than 40,000 lines I only needed the first 2000 lines, together with the last 3 lines. So, after a bit of "quick" thinking, I selected these and pasted them into a fresh file.
      That went quite quickly. Of course it doesn't negate the requirement for kate! And it is only a "special" case...
      Regards
      H. S.
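
The "keep the first 2000 lines and the last 3" case described in this post can also be done from the shell without opening an editor. A rough sketch using head and tail, assuming a generated stand-in file (demo_big.txt and demo_trimmed.txt are hypothetical names, not the poster's files):

```shell
# Demo file standing in for the 40,000-line XML file.
seq 1 40000 > demo_big.txt

# Keep the first 2000 lines and the last 3, dropping everything in between.
{ head -n 2000 demo_big.txt; tail -n 3 demo_big.txt; } > demo_trimmed.txt

# Count the lines that were kept (2000 + 3).
wc -l < demo_trimmed.txt
```

This is the same copy-into-a-fresh-file idea, just done by line count rather than by selecting text.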


        #4
        You can left-click a line number, scroll down and Shift+left-click the second line number; that selects all the lines between them, inclusive, and Delete then removes them.


          #5
          Originally posted by Bings View Post
          You can left-click a line number, scroll down and Shift+left-click the second line number; that selects all the lines between them, inclusive, and Delete then removes them.
          "Scroll down" is the problem. I tried editing a 30,000,000-line file in kate, then deleting lines 1,000,000 to 2,000,000. I could go to line 1,000,000, or to line 2,000,000, but that moves the cursor, so extending the selection didn't work. Scrolling down 1,000,000 lines with Shift+PgDn would take several minutes, even with my keyboard repeat rate set to 50 Hz.

          I tried harder and have just found that one can use the "scrollbar mini map" to move while leaving the cursor behind, and so extend the selection approximately, then correct it reasonably well using Shift+PgUp and Shift+Up. However, deleting the million-line selection was too slow; I killed it after 20 minutes. sed took 2.5 seconds, and vim read, deleted, and saved in 40 s or so.
          Regards, John Little


            #6
            Well, I've never had one that big to put in Kate.

            You can use Ctrl+G to go to a specific line; the same command is at the bottom of the Edit menu. Obviously that doesn't help with the file-size issue. I thought Kate gave a warning above a certain file size about not working well.


              #7
              If a command-line solution is acceptable
              and
              if the strings demarcating the content to be removed are unique
              try
              Code:
              perl -i -p0e 's/START.*STOP/replace_string/smg' file_to_change
              Source
              Kubuntu 20.04


                #8
                In 1998 I used a text editor to process an almost 1GB file that filled an file from a Kodak Unix file server. It was very fast and easy to use, allowing me to pick out the good indexes from the bad. I eventually recovered all but a couple hundred of the 50,000 indexes and jpegs stored on it.

                I began using my smartpagefu to try to locate it because I couldn't remember its name. No joy, but I did recall some others:
                Besides vim, emacs, kate, Write, etc...

                Command-line tools and terminal editors:
                split
                ed
                sed
                awk
                joe
                vi
                mc
                wily

                GUI
                CudaText

                Hex editor
                Tweak
                Tweak uses a complex data structure based on B-trees, designed to make almost all editing operations extremely fast, even when they are working on huge amounts of data.
                The data structure is described in detail on a separate page for those interested. The bottom line is:
                • Tweak supports insert mode (not particularly useful if you're editing an executable file or a filesystem image, but can be extremely handy in other file formats such as PNG).
                • Cutting, copying and pasting within the file you are editing is extremely efficient. No matter how big the chunk of data you are moving around - even if it's a 200Mb section of a CD image - Tweak will always perform the operation effectively instantly.
                • Tweak supports lazy loading of the input file: rather than sucking it all into memory straight away, it simply remembers which parts of the editing buffer are copies of which parts of the input file and refers to the file on disk when it needs to. Tweak only has to take significant time when you genuinely need it to read the entire file. The only two operations with this property are searching, and saving the modified version of the file to disk. Everything else is instant.

                https://www.chiark.greenend.org.uk/~...page-3.02.html
                Interesting example using stat, head, tail, dd, echo:
                https://unix.stackexchange.com/quest...-on-a-system-w


                Proprietary editors:

                $50 for home, $130 commercial
                010 Editor

                $100 for named user
                slickEdit
                Last edited by GreyGeek; Dec 31, 2020, 02:22 PM.
                "A nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people."
                – John F. Kennedy, February 26, 1962.
