<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>ProSPOTLIGHT Blog Posts</title>
    <link>http://www.prospotlight.com/pro/webdeveloper/blog</link>
    <description>ProSPOTLIGHT Recent Posts by Professions</description>
    <item>
      <title>Using grep to find the number of occurrences in a file</title>
      <link>http://www.prospotlight.com/pro/webdeveloper/post/2009-02-12/using-grep-to-find-the-number-of-occurances-in-a-file.html</link>
      <description>&lt;p&gt;In doing some Linux server maintenance, I noticed that the server was using quite a bit more of its CPU resources than normal, yet my Analytics wasn't showing much of a spike in traffic. I have a rather large Apache access_log file, and I wanted to see how many times a particular bot spidered my web pages.&amp;nbsp; Looking through it by hand isn't practical since the log is over 1GB in size.&lt;br /&gt;
&lt;br /&gt;
Instead, I ran this simple grep command:&lt;br /&gt;
&lt;br /&gt;
grep -c &amp;quot;regularexpression&amp;quot; access_log&lt;br /&gt;
&lt;br /&gt;
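In the quotes, I put the real string that I was searching for.&amp;nbsp; The -c flag stands for &amp;quot;count&amp;quot;; it returns the number of lines in the file that match the regular expression, not the total number of matches.&amp;nbsp; Since every request in an access log sits on its own line, that gives the number of requests the bot made.&lt;br /&gt;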
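&lt;br /&gt;
Here is a minimal sketch of how a line count from -c can differ from a count of every individual match.&amp;nbsp; The three-line sample text and the variable names are made up for illustration, not taken from the real log, and the -o flag is assumed to be available (it is in GNU grep):&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical three-line sample standing in for access_log.
sample='bot bot
user
bot'

# -c counts matching lines: the first and third lines match.
matching_lines=$(printf '%s\n' "$sample" | grep -c "bot")

# -o prints each match on its own line, so wc -l counts every occurrence.
# tr strips the padding that some wc implementations add.
occurrences=$(printf '%s\n' "$sample" | grep -o "bot" | wc -l | tr -d ' ')

echo "matching lines: $matching_lines"   # matching lines: 2
echo "occurrences: $occurrences"         # occurrences: 3
```
&lt;br /&gt;
The two numbers only differ when the pattern can show up more than once on the same line.&lt;br /&gt;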
&lt;br /&gt;
In this case, the spider that I thought was the culprit had only downloaded 50 web pages, while the true culprit had downloaded many more.&amp;nbsp; It was using a regular browser's User-Agent, so it's either a really active visitor, a macro plugin, or a spider spoofing a real browser.&amp;nbsp; Either way, if that IP address keeps up that level of activity on the site, an easy solution is to block it with iptables.&lt;/p&gt;</description>
      <guid>http://www.prospotlight.com/pro/webdeveloper/post/2009-02-12/using-grep-to-find-the-number-of-occurances-in-a-file.html</guid>
    </item>
    <item>
      <title>Using GPG with RightScale and Amazon EC2</title>
      <link>http://www.prospotlight.com/pro/webdeveloper/post/2009-01-28/using-gpg-with-rightscale-and-amazon-ec2.html</link>
      <description>&lt;p&gt;The idea behind using a service like RightScale with a cloud hosting service like Amazon's EC2 is that you script everything needed to bring up a server.&amp;nbsp; Because of this, you can dynamically turn servers on and off depending on your current traffic.&lt;br /&gt;
&lt;br /&gt;
Where it gets difficult is when you need to install something that requires user input such as signing a GPG key.&lt;br /&gt;
&lt;br /&gt;
To get around this, you'll need to come up with a solution that will allow you to finish the task without user input.&amp;nbsp; &lt;br /&gt;
&lt;br /&gt;
In the case of GPG keys, instead of signing the key I use the --always-trust option like this:&lt;br /&gt;
&lt;br /&gt;
gpg --always-trust -ear 'username' test.txt&lt;br /&gt;
&lt;br /&gt;
This allows me to encrypt a file in a script without having to answer the Yes/No question of whether I really want to use a key that isn't marked as trusted.&lt;br /&gt;
&lt;br /&gt;
Normally, I would sign the key to avoid this question, but signing the key requires several questions to be answered and I've yet to find a way to script their answers.&lt;/p&gt;
&lt;p&gt;More information about GPG can be found in the &lt;a href=&quot;http://www.gnupg.org/gph/en/manual/book1.html&quot;&gt;GNU Privacy Handbook&lt;/a&gt;.&lt;/p&gt;</description>
      <guid>http://www.prospotlight.com/pro/webdeveloper/post/2009-01-28/using-gpg-with-rightscale-and-amazon-ec2.html</guid>
    </item>
  </channel>
</rss>

