Brian Broderick

Programming the Internet

Adding to the Internet, one line of code at a time...

Using grep to find the number of occurrences in a file

In doing some Linux server maintenance, I noticed that the server was using quite a bit more of its CPU resources than normal, yet my Analytics wasn't showing much of a spike in traffic. I have a rather large Apache access_log file, and I wanted to see how many times a particular bot spidered my web pages.  Looking through it by hand isn't practical since the log is over 1GB in size.

Instead, what I did was this simple grep command:

grep -c "regularexpression" access_log

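In the quotes, I put the real string that I was searching for. The -c flag stands for "count": it prints the number of lines in the file that match the regular expression. Since Apache writes one request per line in the access_log, that number is the number of matching requests.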
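One detail worth seeing in action is that grep -c counts matching lines, not individual matches. The sketch below uses a small made-up log (the "ExampleBot" name, the addresses, and the temporary file are all assumptions for illustration) and compares -c with piping grep -o into wc -l, which counts every match.

```shell
# Build a tiny sample log (hypothetical data) so the example is self-contained.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
192.0.2.10 - - [10/Oct/2009:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326 "-" "ExampleBot/1.0"
192.0.2.10 - - [10/Oct/2009:13:55:40 -0700] "GET /about.html HTTP/1.1" 200 1024 "-" "ExampleBot/1.0"
192.0.2.10 - - [10/Oct/2009:13:55:44 -0700] "GET /bots/ExampleBot.html HTTP/1.1" 200 512 "-" "ExampleBot/1.0"
198.51.100.7 - - [10/Oct/2009:13:56:01 -0700] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
EOF

# Number of lines that contain the pattern at least once.
matching_lines=$(grep -c "ExampleBot" "$sample_log")

# Number of individual matches: -o prints each match on its own line,
# and wc -l counts those lines (tr strips padding some wc versions add).
total_matches=$(grep -o "ExampleBot" "$sample_log" | wc -l | tr -d ' ')

echo "matching lines: $matching_lines, total matches: $total_matches"
rm -f "$sample_log"
```

For an access log the line count is usually the number you want, since each line is one request.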

In this case, the spider that I thought was the culprit had only downloaded 50 web pages, but the true culprit had downloaded many more. It was using a regular browser's User-Agent, so it's either a really active visitor, a macro plugin, or a spider spoofing a real browser. Either way, if that IP address keeps up that level of activity on the site, an easy solution is to block it with iptables.
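One way to spot an address like that is to count requests per client IP. This is only a sketch, not necessarily how I found it: it assumes the common/combined log format, where the client IP is the first field, and it runs against made-up sample data. The iptables rule is printed rather than executed, because running it requires root.

```shell
# Hypothetical sample data standing in for the real access_log.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
198.51.100.7 - - [10/Oct/2009:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
192.0.2.10 - - [10/Oct/2009:13:55:37 -0700] "GET /index.html HTTP/1.1" 200 2326 "-" "ExampleBot/1.0"
198.51.100.7 - - [10/Oct/2009:13:55:38 -0700] "GET /about.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2009:13:55:39 -0700] "GET /contact.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"
EOF

# Pull out the first field (client IP), count how often each one appears,
# sort by that count in descending order, and keep the busiest address.
top_ip=$(awk '{print $1}' "$sample_log" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
echo "busiest client: $top_ip"

# The rule that would drop all incoming traffic from that address.
# Printed only; run it as root if you really want to block the client.
block_rule="iptables -A INPUT -s $top_ip -j DROP"
echo "$block_rule"

rm -f "$sample_log"
```

Dropping the final head and awk stages shows the full ranking, which helps when deciding whether one address really stands out.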

Using GPG with RightScale and Amazon EC2

The idea behind using a service like RightScale with a cloud hosting service like Amazon's EC2 is that you script everything needed to bring up a server. Because of this, you can dynamically turn servers on and off depending on your current traffic.

Where it gets difficult is when you need to install something that requires user input such as signing a GPG key.

To get around this, you'll need to come up with a solution that will allow you to finish the task without any user input.

In the case of GPG, instead of signing the key I use the --always-trust parameter like this:

gpg --always-trust -ear 'username' test.txt

This allows me to encrypt a file in a script without having to answer the Yes/No question of whether I really want to use a key that hasn't been marked as trusted.
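For a boot script it can help to wrap the command so it never stops to ask anything. The following is a minimal sketch under some assumptions: the encrypt_for function and the DRY_RUN switch are hypothetical names made up for this example, and --trust-model always is used as the longer spelling of --always-trust. The --batch and --yes options tell gpg not to prompt and to answer "yes" to most questions, such as overwriting an existing output file.

```shell
# encrypt_for RECIPIENT FILE: encrypt FILE to RECIPIENT without prompting.
# Set DRY_RUN=1 to print the gpg command instead of running it.
encrypt_for() {
    recipient=$1
    file=$2
    # --batch               never ask interactive questions
    # --yes                 assume "yes" (for example, overwrite FILE.asc)
    # --trust-model always  skip key validation, like --always-trust
    # -e -a -r              encrypt, ASCII-armor the output, for this recipient
    set -- gpg --batch --yes --trust-model always -e -a -r "$recipient" "$file"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Show the command that would run; unset DRY_RUN to actually encrypt.
DRY_RUN=1
encrypt_for 'username' test.txt
```

Keep in mind that skipping the trust check means the script will encrypt to whatever key matches 'username' in the keyring, so the server image should only ever import keys you control.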

Normally, I would sign the key to avoid this question, but signing the key requires several questions to be answered and I've yet to find a way to script their answers.

More information about GPG can be found in the GNU Privacy Handbook.

