New utility: Whitelisting Tools for Apache ModSecurity

This topic is closed.

    TL;DR: I have written a CLI utility for Ubuntu to import ModSecurity's audit log file into an SQLite database, which should help people build a whitelist and reduce false positives. A PPA is available.

    Even if you don't use Apache, you might find this interesting. To create my app I had to learn about C++ development on Ubuntu including two third party libraries (Boost Regex and SQLite), version control using Git, the GNU build system "Autotools", how to package software for Ubuntu and Debian, and how to upload packages to a Personal Package Archive (PPA) on Launchpad.

    I hope this will spark some interesting discussion. I'd love to hear other people's experiences with any of the above, particularly Ubuntu development.

    --------------------------------------------------------------------------------------

    What is ModSecurity?
    You may recall some conversations we had before on this forum about Apache2's security module "ModSecurity".

    For those of you who haven't used it, ModSecurity is a Web Application Firewall that can be used with a set of rules to "enumerate badness" and decide when to block requests sent to the server. It sits between Apache and the web applications running on the server, and can therefore intercept malicious requests before they are processed by the app. Probably the most common rule set is the Open Web Application Security Project's Core Rule Set (OWASP CRS), which is in the Ubuntu repos as modsecurity-crs.

    Here's a typical example: Mr Naughty is trying to hack example.com, a website running a vulnerable installation of WordPress on a LAMP server. He uses an SQL injection attack to try to create a new admin user in the database so that he can deface the site, steal data, etc. However, ModSecurity identifies the SQL injection attack contained in the POST variable sent by Mr Naughty and blocks it before it is executed by WordPress. The attack fails.
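    To give a flavour of what a rule looks like, here is a single rule in the "traditional" style. This one is invented for illustration (the ID, message, and pattern are my own, not from the CRS): it lowercases every request parameter and blocks the request outright if one contains a classic SQL injection marker.

```apache
# Illustrative only, not an actual CRS rule. The id and msg are placeholders.
SecRule ARGS "@contains union select" \
    "id:900001,phase:2,t:lowercase,deny,status:403,msg:'SQL injection probe'"
```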

    Sounds great, right?

    Why isn't it more popular?

    I started learning about ModSecurity after Steve recommended it to me, about a year and a half ago. As an enthusiastic but inexperienced amateur, I really struggled to configure it properly: each rule uses pattern matching to decide what to block, and there are inevitable false positives.

    This means you can't just install it and expect it to work. Typically, you run ModSecurity in "detection only" mode for a time (rules are evaluated but ModSecurity doesn't actually block anything), and then inspect the audit logs to identify where you need to amend the rules to remove those false positives.
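    Detection-only mode is a single directive in ModSecurity's main configuration file (the path below is the Ubuntu default; yours may differ):

```apache
# /etc/modsecurity/modsecurity.conf
# Evaluate every rule and write matches to the audit log, but never block:
SecRuleEngine DetectionOnly
# Once the false positives are dealt with, switch to enforcement:
# SecRuleEngine On
```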

    The audit log is a text file with sections for each part of the transaction: the data sent to the server, the response sent back, and any rules that were matched. Since the data for each transaction is split over multiple lines, it does not lend itself to being searched with simple utilities like grep. Identifying all of the requests from a certain IP address that triggered a given rule is a non-trivial exercise.
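    To see the problem, here is a toy two-transaction audit log in the serial format (the IDs, addresses, and paths are invented), and about the most useful thing a single grep can tell you about it: the number of complete entries, found by counting the part-Z boundary that ends each one.

```shell
# Build a fake two-entry audit log (serial format, invented IDs/addresses).
cat > /tmp/toy_audit.log <<'EOF'
--abcd1234-A--
[24/Feb/2015:00:00:01 +0000] txnid0001 203.0.113.5 54321 198.51.100.7 80
--abcd1234-B--
GET /index.php?page=1 HTTP/1.1
--abcd1234-Z--
--efgh5678-A--
[24/Feb/2015:00:00:02 +0000] txnid0002 203.0.113.9 54322 198.51.100.7 80
--efgh5678-Z--
EOF

# The client IP and any matched rules live on different lines, so a single
# grep can't relate them; counting end-of-entry markers is about its limit.
grep -c -- '-Z--' /tmp/toy_audit.log
```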

    Initial Solutions

    My first attempt at tackling the problem was to remove the rules that were being triggered at certain locations. To do this I wrote a Bash script, which you can find with a description on my website. The script doesn't look at the audit log file; it just uses the error messages ModSecurity writes to the Apache error log, and spits out a VirtualHost configuration file listing locations (URLs) where certain rules are disabled.

    This would work OK if you were running ModSecurity in "traditional" mode, where any rule that is matched results in the request being blocked, but it isn't good for the new anomaly scoring mode (the one that enumerates badness). In anomaly scoring mode, each rule has a point score, and the request is blocked if the total exceeds a threshold... I soon realised that my script was actually just removing the rule that adds up the scores and blocks the request, when it should have been removing the individual rules!
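    In anomaly scoring mode, the right fix is to disable the individual noisy rule for the location that triggers it, for example with SecRuleRemoveById. The rule ID and URL below are placeholders for illustration, not real recommendations:

```apache
# Sketch: whitelist one false-positive-prone rule for a single URL,
# instead of removing the rule that totals the score and blocks.
<LocationMatch "^/wp-admin/admin-ajax\.php$">
    SecRuleRemoveById 981231
</LocationMatch>
```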

    This wasn't good enough. I realised I needed a more fine-grained approach, so I learned some Perl. Perl can do multiline regex (slowly!), which enabled me to look at the audit log instead of the error log. The Perl script I wrote splits the audit log into bits and puts them into a spreadsheet. This is the same fundamental approach as my C++ app, but the spreadsheet quickly becomes extremely sluggish, and the script takes ages to run. It does work, though!

    The Solution: the auditlog2db command-line utility

    So, after my partial success with Perl I decided I needed something serious to tackle the problem. I had read that C++ apps are generally faster than scripting languages like Perl, and I wanted to learn the language that our OS is written in. I had an idea that an SQLite database would be a good way to store the information from the audit logs so that it could be queried quickly, but I didn't know any C++ or anything about SQLite.

    I learned:
    • Some basic C++ (hair-tearingly frustrating at times but ultimately rewarding)
    • How to use the C/C++ sqlite API (reasonably well documented but very confusing to someone writing their first C++ app)
    • How to do regular expression matching in C++ using the Boost Regex library (much more difficult than in Perl!)
    • How to use a Makefile to make compilation less tedious.
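    For what it's worth, a hand-written Makefile of the sort mentioned above might look something like this. It's a sketch only: the file names are my guesses, not the project's actual layout.

```makefile
# Sketch: one C++ source file, linked against SQLite and Boost Regex.
CXXFLAGS = -std=c++11 -O2 -Wall
LDLIBS   = -lsqlite3 -lboost_regex

auditlog2db: auditlog2db.cpp
	$(CXX) $(CXXFLAGS) -o $@ $< $(LDLIBS)

clean:
	rm -f auditlog2db
```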


    The result is a C++ command-line utility called auditlog2db that imports the log file into an SQLite3 database. It can process about 2,000 transactions per second, which is about a bazillion times faster than the Perl script.

    As my code got more complicated, I realised I needed to use a proper version control system instead of just saving copies of files as foo.BAK, foo.BAK2 ... so I learned Git. Git is actually quite accessible and definitely worth learning.

    Packaging
    So, at this point my code was on Github and it worked, but I doubted very much whether anyone would find it and use it. Seriously... in 2015, you shouldn't have to compile a program yourself unless you're actually developing it.

    Packaging my code for Ubuntu/Debian turned out to be almost as difficult as writing the damn program!

    I started by learning the GNU build system, Autotools, to replace my handwritten Makefile with a more flexible one. Autotools is the group of programs used in the classic "configure, make, make install" procedure to check dependencies and create a makefile that installs everything to the correct place on your system and can remove it all cleanly again afterwards.
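    For reference, a minimal configure.ac for a project like this might look roughly as follows. This is a sketch under my own assumptions (project name, version), not the project's real build files:

```
dnl configure.ac (minimal sketch; the real project's file will differ)
AC_INIT([auditlog2db], [0.1])
AM_INIT_AUTOMAKE([foreign])
AC_PROG_CXX
AC_CONFIG_FILES([Makefile])
AC_OUTPUT
```

    The matching Makefile.am would be little more than a bin_PROGRAMS = auditlog2db line plus an auditlog2db_SOURCES line.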

    Autotools turned out to be a nightmare. It is not at all easy to learn - something as simple as testing for C++11 support in the compiler and setting the appropriate flag should be easy, but it's not, and requires the use of some pretty archaic m4 macros. The documentation is sparse, and non-trivial example tutorials are hard to come by.

    In defence of Autotools, once it is set up, the "configure, make, make install" procedure is easy to do - I can see why it was good in the days when end users were required to compile software. It also provides some nice features like "make dist", which creates a .tar.gz source archive for distribution - useful for starting a .deb! If I could start over, I think I would learn CMake, which is supposed to be easier.


    Once this was all sorted, I set to work building a .deb package. The Ubuntu documentation can pretty much be summed up in one sentence:
    Build a package as you would for debian, but use the ubuntu release codename (utopic) instead of the Debian codename (unstable) in the changelog file.
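    So a debian/changelog entry targeting Ubuntu looks roughly like this. The version number, email address, and date below are placeholders, not the real package's values:

```
ams-whitelisting-tools (0.0.1-0ubuntu1) utopic; urgency=low

  * Initial release.

 -- Sam Hobbs <sam@example.com>  Tue, 24 Feb 2015 00:00:00 +0000
```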
    After battling my way through the Debian New Maintainers' Guide, I finally produced a package. However, the package checker lintian kicked up a load of errors. Some were trivial, like line lengths in the package description, but others were more serious: a man page was missing.

    Yet another unpleasant surprise: man pages are written using nroff, a markup language even more difficult to learn than TeX, and a lot less useful. Luckily, it was possible to just plagiarise (er, "borrow") a lot of the markup from other man pages, which are stored in /usr/share/man/man1/foo.1.gz. Take a look at a few using the "zless" command, and you'll see why I wasn't enthused at the prospect of writing one from scratch.

    Distribution
    With the package completed and error-free, it was time to upload to a PPA.

    The next surprise was that you can't just create a .deb, sign it, and upload it to a PPA. Launchpad builds the binaries itself from a source archive!

    This is great for quality control, since the packages are built in a sanitised chroot. In fact, this caught a few of my errors, like missing libsqlite3-dev and libboost-regex-dev from the Build-Depends field in the control file. These libraries were (obviously) installed on my laptop already, but they weren't present in the chroot, so the compiler failed during linking.
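    The fix was a couple of extra entries in the Build-Depends field. A fragment of debian/control along these lines (the debhelper version is a typical value, not necessarily what the real package declares):

```
Source: ams-whitelisting-tools
Build-Depends: debhelper (>= 9),
               autotools-dev,
               libsqlite3-dev,
               libboost-regex-dev
```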

    After a bit of trial and error, I got Launchpad to build my app successfully.

    My PPA is here:

    https://launchpad.net/~sam-hobbs/+ar...elisting-tools

    At the moment, packages are only available for 14.10 Utopic Unicorn.

    If you fancy helping me out (I'd really appreciate it!), you can add the PPA, install the package, test it and remove it.

    The package is called ams-whitelisting-tools because I plan on adding other utilities to the package later.

    Obligatory warning: in general, you shouldn't add random PPAs to your system. Only do this if you trust me!

    The following code will add the PPA, install my utility, do some basic tests, and then remove the package and the PPA.

    Code:
    sudo add-apt-repository ppa:sam-hobbs/ams-whitelisting-tools
    sudo apt-get update
    sudo apt-get install ams-whitelisting-tools
    man auditlog2db
    auditlog2db --version
    sudo apt-get remove --purge ams-whitelisting-tools
    sudo add-apt-repository --remove ppa:sam-hobbs/ams-whitelisting-tools
    If you have ModSecurity installed, you can also run this command to generate a database from your audit log file:

    Code:
    auditlog2db -i /var/log/apache2/modsec_audit.log -o ~/modsecurity.db
    The utility is still very much in development, but it is at the stage now where it could be useful to people and I'm very pleased to be able to release something!

    If you do any serious testing, I'd love to hear some feedback. I'm aware that I need to tighten up the --force and --quiet options, which were added recently.

    Thanks
    I only started using Linux a couple of years ago, and have learned a huge amount since then, thanks mainly to this forum and the patient help I have received here.

    Thanks especially to Steve for getting me interested in ModSecurity, and to GreyGeek for encouraging me to learn C++!
    samhobbs.co.uk

    #2
    Interesting read, at least to me. Thank you.

    Naturally I'd suspect that a good Perl or Python programmer could make a version effectively as fast as C++, unless there was some serious compute-bound magic going on, in which case it would be best isolated and made into a library callable from Perl or Python. But if it was easier to do in C++ for you, hats off to you.
    Regards, John Little



      #3
      Yeah, I know what you mean, and I think you're probably right.

      To be fair, I'm not making a like-for-like comparison, because the Perl script was using a library from CPAN to write the data to an Excel file, and I only started using SQLite when I learned C++. I'm sure I could make a fast Perl script using SQLite, but another factor in my decision to learn and use C++ was that C/C++ seems to get top priority when it comes to APIs, whereas the Perl and Python APIs are sometimes less polished/stable... or so I've heard!

      There must be a huge overhead associated with each individual write to the file, because when I enclosed everything in a single transaction the speed jumped from about 5 transactions per second to 2,000. That was just two lines of code! Amazing.
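      The same effect can be demonstrated with the sqlite3 command-line shell (assuming it is installed; the database path and table are throwaway examples): without an explicit BEGIN/COMMIT, each INSERT is its own transaction and is synced to disk individually, which is exactly the per-write overhead described above.

```shell
# Batch several inserts into one transaction, then count the rows.
rm -f /tmp/txn_demo.db
sqlite3 /tmp/txn_demo.db <<'SQL'
CREATE TABLE entries(line TEXT);
-- One transaction for all three inserts, instead of three separate ones:
BEGIN TRANSACTION;
INSERT INTO entries VALUES ('first');
INSERT INTO entries VALUES ('second');
INSERT INTO entries VALUES ('third');
COMMIT;
SELECT count(*) FROM entries;
SQL
```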

      Thanks for an interesting reply.
      Last edited by Feathers McGraw; Feb 24, 2015, 12:46 AM. Reason: Typo
      samhobbs.co.uk



        #4
        Outstanding progress in two years!

        Yes, CMake would have been a better choice. This well-informed rant against Autotools might have been a good thing for you to find first, heh.

        Regarding nroff, it is pathetic that, in 2015, it is still necessary to use a clumsy text formatting system developed in the 1970s! The BSDs switched to mandoc in 2010. Linux should follow their example.



          #5
          Originally posted by SteveRiley View Post
          Regarding nroff, it is pathetic that, in 2015, it is still necessary to use a clumsy text formatting system developed in the 1970s!
          Wow - I remember using SCRIPT in IBM back in the 80s, complete with formatting instructions beginning with a period in column 1, as the history describes from the original RUNOFF (1964).
          Code:
          .pa 
          This is a new page
          .h1 This is a heading 1
          It was the old hands who thought this was just the best thing, and they resisted the newer GML, which introduced tags starting with a colon and ending with a dot that could appear anywhere in the line.
          Code:
          .pa
          This is a new page
          :h1.This is a heading 1
          and end tags with :e... (like :ul. ... :eul. unordered list)
          Replace the tag markers :eul. --> </ul> and you have SGML ... and HTML!
          I'd rather be locked out than locked in.



            #6
            Originally posted by SteveRiley View Post
            Outstanding progress in two years!
            Thank you sensei

            Yes, CMake would have been a better choice. This well-informed rant against Autotools might have been a good thing for you to find first, heh.
            You know what, I actually read that when I had already started implementing Autotools support. I resigned myself to the fact that I would probably have to learn CMake in the future, but I wanted to make Autotools work before I gave up on it so that I understood why people hate it... and now I know!

            Regarding nroff, it is pathetic that, in 2015, it is still necessary to use a clumsy text formatting system developed in the 1970s! The BSDs switched to mandoc in 2010. Linux should follow their example.
            I agree... plenty of things would be better: HTML, or something similar to GitHub's Markdown. If you want people to write good documentation, you have to at least make it easy.
            samhobbs.co.uk
