<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Nick Bernstein</title>
	<atom:link href="http://nicholasbernstein.com/blog/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://nicholasbernstein.com/blog</link>
	<description>Defending the World From Stupidity since 1979</description>
	<lastBuildDate>Tue, 18 May 2010 07:59:01 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=abc</generator>
		<item>
		<title>How changing my homepage let me lose forty pounds and start a company.</title>
		<link>http://nicholasbernstein.com/blog/?p=107</link>
		<comments>http://nicholasbernstein.com/blog/?p=107#comments</comments>
		<pubDate>Tue, 18 May 2010 07:59:01 +0000</pubDate>
		<dc:creator>Nick Bernstein</dc:creator>
				<category><![CDATA[Life]]></category>
		<category><![CDATA[entrenza]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[goals]]></category>
		<category><![CDATA[motivation]]></category>
		<category><![CDATA[startup]]></category>
		<category><![CDATA[tools]]></category>
		<category><![CDATA[work]]></category>

		<guid isPermaLink="false">http://nicholasbernstein.com/blog/?p=107</guid>
		<description><![CDATA[How changing my homepage let me lose over forty pounds and start a company. A simple technique where you use Google Docs as a home page to track goals and actions towards those goals.]]></description>
			<content:encoded><![CDATA[<p>Almost two years ago, I decided I was going to sit down and make some changes to my life. I had just finished a four-year stint at Microsoft, felt burnt out on tech, and had gotten out of shape from working regular sixty-hour weeks in front of a computer and smoking like a chimney. I sat down and thought about what I wanted to achieve and came up with a list of things I wanted to do. I decided to focus on three: quitting smoking, losing weight, and starting a company. I did several things to accomplish these goals, but one of the best, and easiest, was just changing my home page.<br />
Nowadays, when I open up my browser in the morning, I&#8217;m greeted with three pages: the beta page for entrenza.com (my company), Gmail, and foremost &#8211; a Google Docs spreadsheet entitled yyyy-mm <em>goals and actions. </em>The beta page is related to one of these goals, but let&#8217;s talk about the spreadsheet. It&#8217;s really simple and looks something like this:</p>
<table border="0" width="80%">
<tbody>
<tr>
<th></th>
<th>Monday</th>
<th>Tuesday</th>
<th>Wednesday</th>
<th>Thursday</th>
<th>Friday</th>
<th>Saturday</th>
<th>Sunday</th>
</tr>
<tr>
<th>Goal 1</th>
<td>action</td>
<td>another action</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<th>Goal 2</th>
<td>action</td>
<td>another action</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<th>Goal 3</th>
<td>action</td>
<td>action</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
<p>It&#8217;s been a pretty successful system. I don&#8217;t worry about huge strides; I just worry about accomplishing <strong>one</strong> thing towards each goal each day. So far, I&#8217;ve registered an LLC and have regularly worked part-time on a <a href="http://entrenza.com">startup</a> for over a year. We&#8217;re right about to go into beta. As for getting in shape, when I left Seattle two years ago and got my California license, it listed my weight at 205 lbs. I renewed last week and it had 185 lbs on it. Prior to leaving Seattle, you could add another 20 lbs to that. I no longer smoke.</p>
<p>I don&#8217;t know if this will be useful to anyone else, but I&#8217;ve had some success with it, and who knows, maybe it will work for you. If you try it out, I&#8217;d love to hear from you and see how it goes.</p>
]]></content:encoded>
			<wfw:commentRss>http://nicholasbernstein.com/blog/?feed=rss2&amp;p=107</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>More Human than Human</title>
		<link>http://nicholasbernstein.com/blog/?p=105</link>
		<comments>http://nicholasbernstein.com/blog/?p=105#comments</comments>
		<pubDate>Tue, 11 May 2010 20:25:41 +0000</pubDate>
		<dc:creator>Nick Bernstein</dc:creator>
				<category><![CDATA[entrenza]]></category>
		<category><![CDATA[startup]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[reputation monitoring]]></category>
		<category><![CDATA[sentiment]]></category>

		<guid isPermaLink="false">http://nicholasbernstein.com/blog/?p=105</guid>
		<description><![CDATA[I've been working on a startup project on the side for almost a year now - focusing on pattern recognition, natural language processing stuff, and predictive statistical modeling... it's been fun.]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been working on a startup project on the side for almost a year now &#8211; focusing on pattern recognition, natural language processing stuff, and predictive statistical modeling&#8230; it&#8217;s been fun. At the core, we&#8217;ve put together a language analysis engine which looks at a chunk of text and figures out if it&#8217;s positive or negative. In researching this as a problem, we&#8217;ve determined that if you take three individuals and have them categorize the same random text (blog, article, website, tweet, etc.), they will agree 63% of the time. There&#8217;s a little bit of variance depending on what&#8217;s shown, but plus or minus a couple of percentage points, that is about how accurate a human is. We&#8217;ve gone through several different models for the predictions, and tweaked the algorithm quite a bit over multiple versions, but we recently hit a pretty major milestone &#8211; our engine is now rating articles with a 70+% accuracy rate. In other words, when we rate something as positive (meaning the author felt positive about whatever they were writing), a human will agree with our rating 70% of the time. ^_^</p>
<p>We&#8217;re better at determining human opinion than the average human is.</p>
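<p>To make the comparison concrete, here is a minimal sketch of how the two percentages could be computed. It assumes "agree" means all raters picked the same label, and the labels and function names (<code>agreement_rate</code>, <code>accuracy</code>) are made up for illustration; it is not the engine&#8217;s actual code.</p>

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fraction of items on which every rater gave the same label.
# Each element of @$items is an array ref of labels, one per rater.
sub agreement_rate {
    my ($items) = @_;
    return 0 unless @$items;
    my $agreed = 0;
    for my $labels (@$items) {
        my %seen = map { $_ => 1 } @$labels;
        $agreed++ if keys(%seen) == 1;
    }
    return $agreed / @$items;
}

# Fraction of items where the engine's label matches the human's label.
sub accuracy {
    my ($engine, $human) = @_;
    return 0 unless @$engine;
    my $hits = grep { $engine->[$_] eq $human->[$_] } 0 .. $#$engine;
    return $hits / @$engine;
}

# Made-up labels, purely for illustration (not real data).
my @human_ratings = (
    [qw(pos pos pos)],
    [qw(pos neg pos)],
    [qw(neg neg neg)],
    [qw(neg pos neg)],
);
my @engine = qw(pos pos neg neg);
my @human  = qw(pos neg neg neg);

printf "human agreement: %.2f\n", agreement_rate(\@human_ratings);
printf "engine accuracy: %.2f\n", accuracy(\@engine, \@human);
```

<p>With these made-up labels the raters agree on 2 of 4 items (0.50) and the engine matches the human on 3 of 4 (0.75).</p>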
<p>We&#8217;re going to be going into beta soon, on a service that will allow you to track how positive or negative your brand is, by tracking the mentions on the internet &#8211; effectively doing sentiment analysis and tracking; if you&#8217;re interested, you can sign up <a title="here" href="http://www.entrenza.com/index.php?option=com_jforms&amp;view=form&amp;id=1&amp;Itemid=67" target="_self">here</a>. You can read more about the project in general at <a title="www.entrenza.com" href="http://www.entrenza.com">www.entrenza.com</a>.</p>
<p>Thanks to Steve &amp; Jesse &amp; Ben, my collaborators on the project, for making this happen!</p>
]]></content:encoded>
			<wfw:commentRss>http://nicholasbernstein.com/blog/?feed=rss2&amp;p=105</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Would you participate in this kind of contest?</title>
		<link>http://nicholasbernstein.com/blog/?p=103</link>
		<comments>http://nicholasbernstein.com/blog/?p=103#comments</comments>
		<pubDate>Tue, 19 Jan 2010 18:04:46 +0000</pubDate>
		<dc:creator>Nick Bernstein</dc:creator>
				<category><![CDATA[entrenza]]></category>
		<category><![CDATA[startup]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[ai]]></category>
		<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[contest]]></category>
		<category><![CDATA[ps3]]></category>
		<category><![CDATA[xbox]]></category>
		<category><![CDATA[xbox360]]></category>

		<guid isPermaLink="false">http://nicholasbernstein.com/blog/?p=103</guid>
		<description><![CDATA[I&#8217;m thinking, along w/ some people I&#8217;m working on a start-up with, of running a contest to help us &#8220;train&#8221; the back-end &#8220;artificial intelligence engine&#8221; which is used in our software. Here&#8217;s the gist: you would log in to a website and be presented w/ an &#8220;article&#8221; &#8211; this would be a blog, website, etc. [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m thinking, along w/ some people I&#8217;m working on a <a href="http://entrenza.com">start-up</a> with, of running a contest to help us &#8220;train&#8221; the back-end &#8220;artificial intelligence engine&#8221; which is used in our software.</p>
<p>Here&#8217;s the gist: you would log in to a website and be presented w/ an &#8220;article&#8221; &#8211; this would be a blog, website, etc. You would then rate it as positive, negative, &amp; so forth. Anyone who rated 1000 articles in a month (each takes about 1-2 seconds) would be eligible to win a prize, which would be either an xbox 360 or a playstation 3.</p>
<p>So, three questions:</p>
<ul>
<li>Would you do something like this?</li>
<li>Would you be more inclined to do it for an xbox or a ps3?</li>
<li>If you would not be inclined, what could we change to make you more inclined to do it?</li>
</ul>
<p>Thanks so much!</p>
]]></content:encoded>
			<wfw:commentRss>http://nicholasbernstein.com/blog/?feed=rss2&amp;p=103</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Making me reach a bit</title>
		<link>http://nicholasbernstein.com/blog/?p=101</link>
		<comments>http://nicholasbernstein.com/blog/?p=101#comments</comments>
		<pubDate>Sun, 10 Jan 2010 06:10:25 +0000</pubDate>
		<dc:creator>Nick Bernstein</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[startup]]></category>
		<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://nicholasbernstein.com/blog/?p=101</guid>
		<description><![CDATA[I wrote a little while back about some of the things my startup has gotten me to work on, but it occurred to me today, after being grumpy and frustrated for a good chunk of the day &#8211; the firewall died (hardware), the parser&#8217;s got a memory leak and is crashing the parse server [...]]]></description>
			<content:encoded><![CDATA[<p>I <a href="http://nicholasbernstein.com/blog/?p=94">wrote</a> a little while back about some of the things <a href="http://entrenza.com">my startup</a> has gotten me to work on, but it occurred to me today, after being grumpy and frustrated for a good chunk of the day &#8211; the firewall died (hardware), the parser&#8217;s got a memory leak and is crashing the parse server &#8211; how much working on this project has made me a better programmer, technologist, and possibly even a better person.</p>
<p>Tomorrow, after going out to brunch with some friends, and then cleaning the apartment, I&#8217;m going to come back and start doing some <a href="http://stackoverflow.com/questions/1359771/perl-memory-usage-profiling-and-leak-detection">code profiling to look for memory leaks</a>. I&#8217;ve done standard debugging stuff before, and learned the basics of code optimization and such in school back in the day, but the fact of the matter is, until I started working on the <a href="http://www.entrenza.com">startup</a> project, I mainly wrote smaller programs &#8211; utility scripts, small apps, stuff like that. I never really got into situations where I had to think about memory leaks, or applications which would basically run constantly. It&#8217;s other stuff too &#8211; if you write a script to automate the creation of this or that, or a .NET app to generate Foundry ServerIron configs, you don&#8217;t need to put it in an architectural perspective. Now I have to think about that kind of stuff all the time. &#8220;What happens when this breaks?&#8221; &#8211; &#8220;How can I write this so I can add another server and scale out horizontally?&#8221; &#8211; I actually think about this kind of stuff when I&#8217;m coding now. I&#8217;ll scrap whole bits of things that worked because they&#8217;ll cause grief down the line.</p>
<p>I think it&#8217;s also made me more disciplined. There&#8217;s a big difference between doing your job because you know that if you slack you&#8217;ll eventually get grief about it, and if you do it well you&#8217;ll get rewarded for it &#8211; all by someone else &#8211; and setting goals and following through on your own. My home page over the years has been Slashdot, popurls, rootprompt.org and a myriad of other websites&#8230; and that was on my work browser. Now (thank you Firefox and Chrome for having multiple tabs) it&#8217;s a Google Docs spreadsheet of my goals, with columns representing dates and actions. Each day I list what I did to achieve those goals. Another tab contains our ticketing system. Another tab contains our intranet site, in which there are a bunch of daily actions that I try to go through. I would never have approached a job like this if I were being paid by someone else. It took doing this on my own to realize the type of mindset and tools I would have to give myself to accomplish these things.</p>
<p>Another thing that I think has made me a better person is our twice-weekly conference call. We have a very loose structure for the company &#8211; there&#8217;s no office, and we use email, ticketing, and IM to communicate, and conf. calls to go over progress and completed goals. We basically cover what we&#8217;ve done, and what&#8217;s next. It is also a good opportunity for me to talk on the phone with friends, who all, at this point, live in different parts of the world. Yes, it&#8217;s about a shared goal, and a project, but it&#8217;s also about keeping in touch with friends, and doing so regularly. I have traditionally been terrible at keeping in touch with people, and I think that this helps me in that regard.</p>
<p>Tomorrow is going to be frustrating as hell. Don&#8217;t get me wrong, I&#8217;ll have a nice lunch, and put the work out of my mind during that part, and I&#8217;ll enjoy the call &#8211; tomorrow&#8217;s Sunday, one of the days we do it &#8211; but when I start getting into phase II of the code profiling stuff, and looking for circular references and objects that aren&#8217;t being collected, I&#8217;m going to get seriously frustrated and stressed out. I&#8217;m going to hate it. But I&#8217;ll make strides towards getting it fixed. And the idea that taking on a project of this scope, hard as it is, is making me a better coder will give me some solace. Having to change my thinking about where each piece fits in the architecture has, I think, made me a better technologist, and the discipline and keeping in touch with friends have made me a better person, I hope. Yes, I&#8217;ll definitely be incredibly frustrated when half the stuff I&#8217;m trying to do ends up breaking things temporarily, but I think, in the end, it&#8217;ll be worth it.</p>
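<p>For anyone who has not chased this kind of leak before, here is a minimal sketch of the circular-reference problem in Perl and one common fix, <code>weaken</code> from the core <code>Scalar::Util</code> module. The <code>Node</code> class and the counter are invented for illustration and have nothing to do with the actual parser.</p>

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Counts how many Node objects have been destroyed so far.
my $destroyed = 0;

package Node;
sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
sub DESTROY { $destroyed++ }

package main;

# Two objects that point at each other: their reference counts never
# reach zero, so Perl cannot free them when the scope ends.
{
    my $parent = Node->new('parent');
    my $child  = Node->new('child');
    $parent->{child} = $child;
    $child->{parent} = $parent;
}
print "after leaky scope: $destroyed destroyed\n";

# Same shape, but the back-reference is weakened, so it no longer
# keeps the parent alive and both objects are freed.
{
    my $parent = Node->new('parent');
    my $child  = Node->new('child');
    $parent->{child} = $child;
    $child->{parent} = $parent;
    weaken($child->{parent});
}
print "after weakened scope: $destroyed destroyed\n";
```

<p>The first scope leaves both objects alive until the program exits, so the counter still reads 0; after the second scope it reads 2, because weakening the back-reference lets both objects be freed as soon as the scope ends.</p>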
]]></content:encoded>
			<wfw:commentRss>http://nicholasbernstein.com/blog/?feed=rss2&amp;p=101</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cool things I&#8217;ve gotten to play with at my startup</title>
		<link>http://nicholasbernstein.com/blog/?p=94</link>
		<comments>http://nicholasbernstein.com/blog/?p=94#comments</comments>
		<pubDate>Tue, 25 Aug 2009 17:36:26 +0000</pubDate>
		<dc:creator>Nick Bernstein</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[entrenza]]></category>
		<category><![CDATA[learning]]></category>
		<category><![CDATA[startup]]></category>
		<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://nicholasbernstein.com/blog/?p=94</guid>
		<description><![CDATA[For the last year or so I've been working on a tech startup with some friends. In doing so, I've gotten to work with some pretty cool stuff, and I thought I'd make a list of some of them. ]]></description>
			<content:encoded><![CDATA[<p>For the last year or so I&#8217;ve been working on a tech startup with some friends. In doing so, I&#8217;ve gotten to work with some pretty cool stuff, and I thought I&#8217;d make a list of some of it. Basically, I wanted to extol the virtues of working on a startup as a great way to get real-life practice projects to work on &#8211; I have every expectation that we will have at least some success, but even if it ends up being a failure, here are some of the projects that I&#8217;ve gotten to work on:</p>
<ul>
<li>wrote a spidering application</li>
<li>set up a mysql cluster</li>
<li>set up zimbra</li>
<li>researched several virtualization options</li>
<li>set up apt-proxies</li>
<li>wrote a custom smtp daemon / parser</li>
<li>learned a boatload about bayesian analysis and other pattern recognition and predictive tools</li>
<li>set up joomla</li>
<li>set up drupal</li>
<li>set up a linux natting/routing firewall</li>
<li>wrote project plans</li>
<li>managed &amp; motivated</li>
<li>learned how to incorporate</li>
<li>set up dnsmasq for dhcp/dns masquerading</li>
<li>set up bind dns &amp; replication</li>
<li>learned how to motivate people and lead weekly conference calls</li>
<li>worked on project management</li>
<li>marketing</li>
<li>sales</li>
<li>set up ldap authentication</li>
<li>set up openNAS</li>
</ul>
<p>Now &#8211; I&#8217;ve done a bunch of these things before in previous jobs, but it&#8217;s still good practice and there were several I hadn&#8217;t played with before. It&#8217;s a great opportunity to learn, and even if it doesn&#8217;t end up succeeding, the time I&#8217;ve put in will not have been wasted. I&#8217;ve become a better programmer, a better leader, and I&#8217;ve had a lot of fun doing it.</p>
<p>If you are interested in keeping track of morale at your company or on your project, or in how positive people are about your brand or a search term, we&#8217;re looking for beta customers. Feel free to ping me @nickbernstein on twitter if you think you might be interested.</p>
]]></content:encoded>
			<wfw:commentRss>http://nicholasbernstein.com/blog/?feed=rss2&amp;p=94</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A letter to my congresswoman</title>
		<link>http://nicholasbernstein.com/blog/?p=91</link>
		<comments>http://nicholasbernstein.com/blog/?p=91#comments</comments>
		<pubDate>Thu, 14 May 2009 19:22:29 +0000</pubDate>
		<dc:creator>Nick Bernstein</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://nicholasbernstein.com/blog/?p=91</guid>
		<description><![CDATA[I thought I would put up a letter I recently wrote to my congresswoman, Maxine Waters, regarding the recent information that has come to light about torture under the Bush administration. I am writing you today in regards to the recent information about the torture which has taken place in the name of the American [...]]]></description>
			<content:encoded><![CDATA[<p>I thought I would put up a letter I recently wrote to my congresswoman, Maxine Waters, regarding the recent information that has come to light about torture under the Bush administration. </p>
<p><code><br />
I am writing you today in regards to the recent information about the torture which has taken place in the name of the American people. The atrocities, a word I do not use lightly, which have taken place *must* be investigated, and this needs to be done in a manner that is as fast and impartial as possible. I believe that there needs to be an investigation done not by Americans but by an international third party such as the UN. This cannot be perceived as a partisan attack against Republicans, it is too important, but an investigation and prosecution of this torture must be undertaken. Our country is nothing without the ideals and principles under which it was formed, and this must be dealt with. </p>
<p>Thank you for your time,<br />
Nicholas Bernstein<br />
</code></p>
<p>If you are not familiar, here is some background information:</p>
<ul>
<li><a href="http://www.dailykos.com/storyonly/2009/5/14/731112/-Seymour-Hersh:-Children-raped-on-camera-in-front-of-women-at-Abu-Ghraib.-How-bad-are-these-photos">http://www.dailykos.com/storyonly/2009/5/14/731112/-Seymour-Hersh:-Children-raped-on-camera-in-front-of-women-at-Abu-Ghraib.-How-bad-are-these-photos</a></li>
<li><a href="http://shatteredparadigm.blogspot.com/2009/02/undescribable-torture-guantanamo.html">http://shatteredparadigm.blogspot.com/2009/02/undescribable-torture-guantanamo.html</a></li>
<li><a href="http://www.huffingtonpost.com/2009/04/21/daily-show-takes-on-tortu_n_189356.html">http://www.huffingtonpost.com/2009/04/21/daily-show-takes-on-tortu_n_189356.html</a></li>
</ul>
<p>To balance it out, here is a video of baby pigs being cute: http://www.youtube.com/watch?v=FIWf_hc1_TM </p>
]]></content:encoded>
			<wfw:commentRss>http://nicholasbernstein.com/blog/?feed=rss2&amp;p=91</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A quick perl script for used car research on craigslist</title>
		<link>http://nicholasbernstein.com/blog/?p=79</link>
		<comments>http://nicholasbernstein.com/blog/?p=79#comments</comments>
		<pubDate>Wed, 22 Apr 2009 07:25:12 +0000</pubDate>
		<dc:creator>Nick Bernstein</dc:creator>
				<category><![CDATA[Life]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[car]]></category>
		<category><![CDATA[car buying]]></category>
		<category><![CDATA[craigslist]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[prices]]></category>
		<category><![CDATA[program]]></category>
		<category><![CDATA[script]]></category>

		<guid isPermaLink="false">http://nicholasbernstein.com/blog/?p=79</guid>
		<description><![CDATA[After CarMax tried to gouge me, I thought I would test the open market and write a quick perl script to give me the average price for an item on craigslist, in this case a car. Here's how it works, and the code is below. It should be really easy to modify for anyone who could use something like this:]]></description>
			<content:encoded><![CDATA[<p>So my 2001 VW Jetta is getting a bit up there in the miles &#8211; it&#8217;s about 95k at the moment, and while this isn&#8217;t too much for a VW, I&#8217;ve been wanting to get a new car and I figure I should get rid of it while I can still feel good about selling it to someone else. Besides, I want to get a convertible &#8211; I live in southern California, and if it&#8217;s not the appropriate climate for one, I don&#8217;t know what is.</p>
<p>Initially I went to CarMax and they offered me $2000. Yeeps! I was shocked. Could it really be worth that little? Kelley Blue Book said it was worth at least $4k. So I thought I would test the open market and write a quick perl script to give me the average price for an item on craigslist. Here&#8217;s how it works, and the code is below. It should be really easy to modify for anyone who could use something like this:</p>
<p><code><br />
./cl_get_prices.pl 2001+jetta</p>
<p>http://losangeles.craigslist.org/search/cta?query=2001+jetta</p>
<p>lowest:         1200<br />
highest:        9999<br />
Average:        6134.48214285714<br />
</code></p>
<p>another example where I search for porsche boxster:</p>
<p><code><br />
nick:~$ ./cl_get_prices.pl porsche+boxster</p>
<p>http://losangeles.craigslist.org/search/cta?query=porsche+boxster</p>
<p>lowest:         7600<br />
highest:        33100<br />
Average:        16865.6341463415<br />
</code></p>
<p>Anyway the code is below, and I&#8217;ll put a link to the actual perl script:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="perl" style="font-family:monospace;"><span style="color: #666666; font-style: italic;">#!/usr/bin/perl</span>
<span style="color: #0000ff;">$wget</span><span style="color: #339933;">=</span><span style="color: #ff0000;">&quot;http://losangeles.craigslist.org/search/cta?query=&quot;</span><span style="color: #339933;">;</span>
<span style="color: #0000ff;">$wget</span> <span style="color: #339933;">.=</span> <span style="color: #0000ff;">$ARGV</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">0</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
<span style="color: #000066;">print</span> <span style="color: #0000ff;">$wget</span> <span style="color: #339933;">.</span> <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
<span style="color: #0000ff;">$html</span> <span style="color: #339933;">=</span> <span style="color: #ff0000;">`wget -q -O - $wget`</span><span style="color: #339933;">;</span>
        <span style="color: #0000ff;">@words</span> <span style="color: #339933;">=</span> <span style="color: #000066;">split</span><span style="color: #009900;">&#40;</span><span style="color: #ff0000;">' '</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">$html</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #b1b100;">foreach</span> <span style="color: #0000ff;">$word</span> <span style="color: #009900;">&#40;</span><span style="color: #0000ff;">@words</span><span style="color: #009900;">&#41;</span>
                <span style="color: #009900;">&#123;</span>
                 <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span> <span style="color: #0000ff;">$word</span> <span style="color: #339933;">=~</span> <span style="color: #009966; font-style: italic;">m/^\$/</span><span style="color: #009900;">&#41;</span>
                        <span style="color: #009900;">&#123;</span>
                                <span style="color: #0000ff;">$word</span> <span style="color: #339933;">=~</span> <span style="color: #009966; font-style: italic;">s/(\$|,)//g</span> <span style="color: #339933;">;</span>
                                <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span> <span style="color: #0000ff;">$word</span> <span style="color: #339933;">=~</span> <span style="color: #009966; font-style: italic;">m/^\d+$/</span> <span style="color: #009900;">&#41;</span>
                                <span style="color: #009900;">&#123;</span>
                                        <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span> <span style="color: #0000ff;">$lowest</span> <span style="color: #b1b100;">eq</span> <span style="color: #ff0000;">''</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                                                <span style="color: #0000ff;">$lowest</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">$word</span> <span style="color: #339933;">;</span>
                                        <span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">elsif</span> <span style="color: #009900;">&#40;</span> <span style="color: #0000ff;">$word</span> <span style="color: #339933;">&lt;</span> <span style="color: #0000ff;">$lowest</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                                                <span style="color: #0000ff;">$lowest</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">$word</span> <span style="color: #339933;">;</span>
                                        <span style="color: #009900;">&#125;</span>
                                        <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span> <span style="color: #0000ff;">$highest</span> <span style="color: #b1b100;">eq</span> <span style="color: #ff0000;">''</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                                                <span style="color: #0000ff;">$highest</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">$word</span> <span style="color: #339933;">;</span>
                                        <span style="color: #009900;">&#125;</span> <span style="color: #b1b100;">elsif</span> <span style="color: #009900;">&#40;</span> <span style="color: #0000ff;">$word</span> <span style="color: #339933;">&gt;</span> <span style="color: #0000ff;">$highest</span> <span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                                                <span style="color: #0000ff;">$highest</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">$word</span> <span style="color: #339933;">;</span>
                                        <span style="color: #009900;">&#125;</span>
                                        <span style="color: #0000ff;">$amt</span> <span style="color: #339933;">+=</span> <span style="color: #0000ff;">$word</span> <span style="color: #339933;">;</span>
                                        <span style="color: #0000ff;">$count</span><span style="color: #339933;">++;</span>
                                <span style="color: #009900;">&#125;</span>
                        <span style="color: #009900;">&#125;</span>
                <span style="color: #009900;">&#125;</span>
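<span style="color: #666666; font-style: italic;"># avoid dividing by zero when no prices were found</span>
<span style="color: #0000ff;">$average</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">$count</span> <span style="color: #339933;">?</span> <span style="color: #009900;">&#40;</span> <span style="color: #0000ff;">$amt</span> <span style="color: #339933;">/</span> <span style="color: #0000ff;">$count</span> <span style="color: #009900;">&#41;</span> <span style="color: #339933;">:</span> <span style="color: #cc66cc;">0</span> <span style="color: #339933;">;</span>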
<span style="color: #000066;">print</span> <span style="color: #ff0000;">&quot;lowest:<span style="color: #000099; font-weight: bold;">\t</span><span style="color: #000099; font-weight: bold;">\t</span>$lowest<span style="color: #000099; font-weight: bold;">\n</span>highest:<span style="color: #000099; font-weight: bold;">\t</span>$highest<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
<span style="color: #000066;">print</span> <span style="color: #ff0000;">&quot;Average:<span style="color: #000099; font-weight: bold;">\t</span>&quot;</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">$average</span> <span style="color: #339933;">.</span><span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p>Here&#8217;s a link to the actual script you can <a href="http://nicholasbernstein.com/Scripts/cl_get_prices.pl">download</a>. If you find it useful, let me know.</p>
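<p>For anyone adapting the script, here is a rough sketch of the same price-averaging idea with <code>use strict</code> and the core <code>List::Util</code> module. The helper names <code>extract_prices</code> and <code>summarize</code> and the sample text are hypothetical; the sample string stands in for the page that <code>wget</code> would fetch, so the sketch runs without a network connection.</p>

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(min max sum);

# Pull whole-dollar prices such as "$4,500" out of a chunk of text.
# Mirrors the original approach: split on whitespace, keep words that
# start with "$", drop "$" and ",", and keep only all-digit results.
sub extract_prices {
    my ($text) = @_;
    my @prices;
    for my $word (split ' ', $text) {
        next unless $word =~ /^\$/;
        (my $clean = $word) =~ s/[\$,]//g;
        push @prices, $clean if $clean =~ /^\d+$/;
    }
    return @prices;
}

# Return lowest, highest and average, or an empty list if nothing matched.
sub summarize {
    my @prices = @_;
    return unless @prices;
    return (min(@prices), max(@prices), sum(@prices) / @prices);
}

# Hypothetical sample text standing in for the downloaded search page.
my $sample = 'Jetta GLS $4,500 clean title / 2001 jetta $1200 obo / call $$$';
my @stats  = summarize(extract_prices($sample));
if (@stats) {
    printf "lowest:\t\t%d\nhighest:\t%d\nAverage:\t%.2f\n", @stats;
} else {
    print "no prices found\n";
}
```

<p>Returning an empty list when nothing matches lets the caller print a message instead of dividing by zero.</p>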
]]></content:encoded>
			<wfw:commentRss>http://nicholasbernstein.com/blog/?feed=rss2&amp;p=79</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Playing with Bayes</title>
		<link>http://nicholasbernstein.com/blog/?p=77</link>
		<comments>http://nicholasbernstein.com/blog/?p=77#comments</comments>
		<pubDate>Sat, 04 Apr 2009 19:16:49 +0000</pubDate>
		<dc:creator>Nick Bernstein</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[ai]]></category>
		<category><![CDATA[bayes]]></category>
		<category><![CDATA[blog parsing]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[startup]]></category>

		<guid isPermaLink="false">http://nicholasbernstein.com/blog/?p=77</guid>
		<description><![CDATA[For the last day or so I've been playing with moving over my simple word-count-analysis of blogs to actually creating a database with manually ranked training data and extrapolating from that. There were some hiccups and I've still got to go back and replace a lot of code, but it's effectively categorizing new blog entries based on previous rankings. YAY!]]></description>
			<content:encoded><![CDATA[<p>For the last day or so I&#8217;ve been playing with moving over my simple word-count-analysis of blogs to actually creating a database with manually ranked training data and extrapolating from that. There were some hiccups and I&#8217;ve still got to go back and replace a lot of code, but it&#8217;s effectively categorizing new blog entries based on previous rankings. YAY! I&#8217;ve been using the perl Algorithm::NaiveBayes cpan module, and it&#8217;s pretty great &#8211; although the documentation is really poor. The main thing to get is that it returns a hash reference, which means you end up referring to your result as something like:</p>
<pre>print "Sports ranking: \t ${$result}{sports} \n";</pre>
<p>Which, let&#8217;s face it, is kinda ugly. The arrow form, $result-&gt;{sports}, means the same thing and reads a little better, but either way it should really be better documented. Aside from that, as long as you get the back-end math, you&#8217;re pretty OK. Just because you&#8217;re doing AI stuff doesn&#8217;t mean that you&#8217;re automatically familiar with how perl handles references to hashes, though. One of my to-do items is to go back and update the perldoc on it. Anyway, with that in effect I&#8217;ve gone and updated the database and I&#8217;m now able to get positivity over time. This means I&#8217;m actually getting closer to building an internet happiness index, and predicting how &#8220;happy&#8221; the internet is as a whole. The next steps are:</p>
<ul>
<li>incorporate new bayes functions into existing codebase</li>
<li>add more sources to the rss feedlist scripts</li>
<li>optimize the blogparser</li>
<li>put a nice (fusioncharts?) front end together</li>
<li>add more hardware for doing the categorization</li>
<li>get more people to do more training data</li>
<li>???</li>
<li>profit!</li>
</ul>
<p>Actually, the &#8220;???&#8221; is pretty well defined, but honestly, this project will have been fun even if it doesn&#8217;t make a dime. Anyway, one step closer.</p>
]]></content:encoded>
			<wfw:commentRss>http://nicholasbernstein.com/blog/?feed=rss2&amp;p=77</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Dear Bestbuy, why don&#8217;t you value me as a customer?</title>
		<link>http://nicholasbernstein.com/blog/?p=72</link>
		<comments>http://nicholasbernstein.com/blog/?p=72#comments</comments>
		<pubDate>Tue, 17 Mar 2009 00:14:28 +0000</pubDate>
		<dc:creator>Nick Bernstein</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://nicholasbernstein.com/blog/?p=72</guid>
		<description><![CDATA[On my birthday, I drove from one store to another in the boston area trying to find a dell mini 9 netbook. I eventually did, the last one they had, which was an open box return, mid wipe. I purchased it @ full price, and even had to sign something saying I was OK with [...]]]></description>
			<content:encoded><![CDATA[<p>On my birthday, I drove from one store to another in the boston area trying to find a dell mini 9 netbook. I eventually did, the last one they had, which was an open box return, mid wipe. I purchased it @ full price, and even had to sign something saying I was OK with it not having the OS/Drivers installed.</p>
<p>I was fine with this; I was planning on installing ubuntu on it anyway. The problem was that w/in a few days the &#8220;p&#8221; and &#8220;o&#8221; keys had gone dead. No problem, I think, I&#8217;ll just take it back. I had left the box at my GF&#8217;s house, so I asked her to ship it to my apartment. Once I had gotten home to california and gotten the box, I took it back to my closest best buy in el segundo, ca. and <em>they refused to do an <strong>exchange</strong></em><strong>.</strong> An exchange. I don&#8217;t want my money back, or anything fancy &#8211; the keys don&#8217;t work, I just wanted to swap it out, no data transfer even, I wiped it for them. I cajoled, I bargained, I tried to explain what an example of bad customer service this was, but to no avail. The manager wouldn&#8217;t even come out to see me. ( !! )</p>
<p>In the last year I have purchased (off the top of my head):</p>
<ul>
<li>a laptop (alum. macbook)</li>
<li>several mice</li>
<li>multiple laptop cases</li>
<li>two network routers</li>
<li>several video adaptors</li>
<li>a printer</li>
<li>two 160g usb hd drives</li>
<li>1 terabyte usb hard drive</li>
<li>several video games</li>
<li>ethernet cables out the wazoo</li>
<li>a gigabit switch</li>
<li>a non-gigabit switch</li>
<li>extended power strips</li>
</ul>
<p>I&#8217;m not going to enumerate everything I&#8217;ve purchased, but I buy a *lot* of computer equipment. I can&#8217;t imagine why bestbuy would have a policy in place that would be so strict that it would willingly lose a customer over an *exchange* &#8211; I mean, this is exactly the type of service that drove compusa out of business. I spent a bunch of money, value me as a customer. If anyone from bestbuy reads this, this is the resolution I would like: contact the bestbuy in el segundo, and ask them to exchange my netbook. That&#8217;s it. Like for like. Otherwise, there&#8217;s a fry&#8217;s electronics a block away, and I&#8217;m sure they&#8217;d be happy to take my money.</p>
<p>-Nick Bernstein.</p>
]]></content:encoded>
			<wfw:commentRss>http://nicholasbernstein.com/blog/?feed=rss2&amp;p=72</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>An alternative approach to snaprestore rollbacks for virus outbreaks</title>
		<link>http://nicholasbernstein.com/blog/?p=64</link>
		<comments>http://nicholasbernstein.com/blog/?p=64#comments</comments>
		<pubDate>Fri, 16 Jan 2009 20:44:19 +0000</pubDate>
		<dc:creator>Nick Bernstein</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[disaster recovery]]></category>
		<category><![CDATA[flexclone]]></category>
		<category><![CDATA[it]]></category>
		<category><![CDATA[netapp]]></category>
		<category><![CDATA[snap restore]]></category>
		<category><![CDATA[virus outbreak]]></category>

		<guid isPermaLink="false">http://nicholasbernstein.com/blog/?p=64</guid>
		<description><![CDATA[overview Netapp&#8217;s snap restore product is a fantastic tool in a storage admin&#8217;s arsenal. It works well. It&#8217;s fast, and it doesn&#8217;t need to &#8220;restore&#8221; data, it just makes a previous snapshot the active file system. That said, I keep seeing the same scenario put forth by netapp and various other folks as a use [...]]]></description>
			<content:encoded><![CDATA[<h2>overview</h2>
<div class="wp-caption alignright" style="width: 410px"><a href="http://www.cs.nmsu.edu/~pfeiffer/classes/474/notes/file.gif"><img title="Figure 1" src="http://www.cs.nmsu.edu/~pfeiffer/classes/474/notes/file.gif" alt="Figure 1" width="400" height="626" /></a><p class="wp-caption-text">Figure 1</p></div>
<p>Netapp&#8217;s snap restore product is a fantastic tool in a storage admin&#8217;s arsenal. It works well. It&#8217;s fast, and it doesn&#8217;t need to &#8220;restore&#8221; data, it just makes a previous snapshot the active file system. That said, I keep seeing the same scenario put forth by netapp and various other folks as a use case for snaprestore, and it&#8217;s one where snaprestore would be my second choice. The scenario is this: We&#8217;re taking hourly snapshots over the course of a day. A virus breaks out. The files on our cifs shares have been compromised. We&#8217;re sure that our current data is infected, and we know that at least some of the data was infected an hour ago. Two hours ago is uncertain, but we are sure that the virus hadn&#8217;t broken out three hours ago, so we know the file system at that point is clean. The recommendation you always hear is to use snap restore to roll back the file system to three hours ago, and you&#8217;re now clean and not infected. Unfortunately, as a side effect, we&#8217;ve lost all the data after that point &#8211; which, I should mention, is the intended result: if we hadn&#8217;t, our files would still be infected, right? My solution is to instead clone the volume using a really neat tool called flexclone.</p>
<h2>some snapshot basics</h2>
<p>I&#8217;d like to suggest an alternate method, but before I get into it, it probably makes sense to talk about how snapshots work. The idea is very simple. Each file on a file system has an <a href="http://en.wikipedia.org/wiki/Inode">inode</a>. An inode contains information about a file &#8211; the owner, when it was created, etc. If you&#8217;re on a windows box, and you view the &#8220;properties&#8221; of a file, you&#8217;re seeing the information contained in an inode. A hard drive is made up of blocks of data, and another thing that an inode does is point to the blocks that make up the data for a given file. If a file is big, the inode won&#8217;t actually point to the data blocks themselves, it will point to indirect blocks which then point to data blocks ( inode -&gt; indirect level 1 -&gt; data ). If a file is <em>really</em> big we can have multiple levels of indirection, like in <em>figure 1</em>. A snapshot is, for all intents and purposes, a copy of just that top level inode block. This block then contains pointers to the indirect blocks, and those indirect blocks to the data blocks. ( Is this fun yet? ) Once we&#8217;ve made a copy of the inode (parent) the indirect and data blocks (children) are effectively frozen, since a block cannot be changed unless its parent, or parents (original inode and now the snapshot) agree. The way we handle modifications to files in the netapp world is to write those changes to a special reserved section of the volume called the &#8220;snapshot reserve&#8221; (apt) and to update the original inode to point to those blocks. It&#8217;s pretty slick, really.</p>
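<p>If it helps to see the pointer idea as code, here is a minimal Python sketch. It is a toy model and not the real on-disk layout: the <strong>Volume</strong> class and its methods are made up for illustration, indirect blocks are flattened into a single level of pointers, a write replaces the whole file, and the snapshot reserve is left out. What it does show is that a snapshot copies only pointers, and that changes land in new blocks so the old ones stay readable.</p>

```python
# Toy model of the inode-to-block pointer idea described above.
# Not the real on-disk layout; all names here are made up for illustration.

class Volume:
    def __init__(self):
        self.blocks = {}      # block number -> data
        self.next_block = 0
        self.active = {}      # active "inode": file name -> list of block numbers
        self.snapshots = {}   # snapshot name -> frozen copy of the pointers

    def _allocate(self, data):
        number = self.next_block
        self.blocks[number] = data
        self.next_block += 1
        return number

    def write(self, name, chunks):
        # Changes always go to freshly allocated blocks; old blocks are never
        # overwritten, so anything a snapshot points at stays intact.
        self.active[name] = [self._allocate(chunk) for chunk in chunks]

    def snapshot(self, snap_name):
        # A snapshot copies only the pointers, never the data blocks.
        self.snapshots[snap_name] = {n: list(p) for n, p in self.active.items()}

    def read(self, name, snap_name=None):
        pointers = self.snapshots[snap_name] if snap_name else self.active
        return "".join(self.blocks[b] for b in pointers[name])


vol = Volume()
vol.write("report.doc", ["clean ", "data"])
vol.snapshot("hourly.0")
vol.write("report.doc", ["infected ", "data"])

print(vol.read("report.doc"))              # current contents
print(vol.read("report.doc", "hourly.0"))  # contents as of the snapshot
```

<p>The second read still returns the old contents because the snapshot&#8217;s pointers were never touched, only the active ones.</p>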
<h2>flexclone</h2>
<p>In order to understand how this alternative solution works, we need to understand how flexclone works. Flexclone &#8211; to oversimplify things &#8211; is a writeable snapshot. Actually, it&#8217;s not that much of an oversimplification, really. If we go back to our snapshot basics where we talked about how each file has an inode which points to data blocks, I should mention that under the hood, everything on a filesystem is actually just, well, a file&#8230; even directories. A directory is just a file that contains a list of file names and where the inodes for those files are; that&#8217;s the data in that &#8220;file&#8217;s&#8221; data blocks. The top of any volume (think of this as a drive) is a directory, and that directory, in turn, has an inode. This one we call the &#8220;root inode&#8221;. When we take a snapshot of a volume, this is the inode we&#8217;re basing our snapshot off of. So, what we end up with is: [ root inode ] [ snapshot copy of root inode ]. What we can do is add one more copy to make a &#8220;clone&#8221; of the volume. This leaves us with: [ root inode ] [ snapshot copy of inode ] [ clone copy of inode ]. Now we can still write to our normal volume by writing to that snapshot reserve that we talked about, and what we do when we create a flex clone is associate a *new* snapshot reserve with the clone. The cool thing is <em>initially the clone takes up <span style="text-decoration: underline;">no space</span></em>. Ok, that&#8217;s a lie, it takes up a couple of kilobytes, but we&#8217;re talking storage systems here, a couple of kilobytes <em>is</em> nothing. This is really useful for things like backing up a database, or doing QA &#8211; there&#8217;s a whole bunch of use cases, such as, oh, avoiding the unnecessary data loss of using snap restore in the event of a virus outbreak.</p>
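<p>The &#8220;writeable snapshot&#8221; idea can be sketched in a few lines of Python. This is a toy model under my own assumptions, not how Data ONTAP is implemented: <strong>BlockStore</strong> and <strong>WritableVolume</strong> are hypothetical names, each file is a single block, and the snapshot reserve is not modeled. The point is only that a clone starts life as one more copy of the pointers, so it uses no new data blocks until somebody writes to it.</p>

```python
# Toy model of a clone as a writable snapshot. This only illustrates the
# idea of sharing blocks; it is not how Data ONTAP is implemented, and every
# name below is hypothetical.

class BlockStore:
    """Shared pool of data blocks, standing in for the aggregate."""
    def __init__(self):
        self.blocks = []

    def allocate(self, data):
        self.blocks.append(data)
        return len(self.blocks) - 1


class WritableVolume:
    def __init__(self, store, pointers=None):
        self.store = store
        # file name -> block number; copying this map is all a clone needs.
        self.pointers = dict(pointers or {})

    def write(self, name, data):
        # New data goes to a new block; shared blocks are left untouched.
        self.pointers[name] = self.store.allocate(data)

    def read(self, name):
        return self.store.blocks[self.pointers[name]]

    def snapshot(self):
        return dict(self.pointers)  # treated as read-only by convention


store = BlockStore()
vol1 = WritableVolume(store)
vol1.write("a.txt", "clean a")
vol1.write("b.txt", "clean b")
hourly_2 = vol1.snapshot()

vol1.write("a.txt", "infected a")       # the outbreak hits the active volume
blocks_before = len(store.blocks)

vol1_clone = WritableVolume(store, hourly_2)   # clone from the clean snapshot
blocks_after_clone = len(store.blocks)

vol1_clone.write("c.txt", "new work")   # the clone accepts writes of its own

print(blocks_after_clone - blocks_before)  # new data blocks used by the clone
print(vol1_clone.read("a.txt"), "|", vol1.read("a.txt"))
```

<p>Creating the clone allocates nothing in this model, and writes to the clone never show up in the original volume, which is the property the rest of this post leans on.</p>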
<h2>the actual solution</h2>
<p>Ok, now that we&#8217;ve brushed up on our netapp basics, let&#8217;s review our problem:</p>
<ul>
<li>we&#8217;ve taken hourly snapshots</li>
<li>our active file system is infected with a virus, we&#8217;ll assume the volume name is <strong>vol1</strong></li>
<li>so is our previous hourly snapshot and probably the one before that</li>
<li>the normal recommended solution would be to snap restore to hourly.2 (three hours previous) before the virus broke out</li>
</ul>
<p>Using this method would mean that we would lose three hours worth of data. That&#8217;s not good. Preventing data loss is the whole job of a storage admin, as far as I&#8217;m concerned. So, an alternate method:</p>
<ol>
<li>take cifs offline, using the <strong>cifs terminate</strong> command</li>
<li>on your admin host copy the file <strong>/vol/vol0/etc/cifsconfig_share.cfg</strong> -&gt; cifsconfig_share.cfg.YYMMDD.pre-virus.bak</li>
<li>make a clone of vol1 called &#8220;vol1_clone&#8221; from the last known clean snapshot (hourly.2 in our scenario) with the command: <strong>vol clone create vol1_clone -b vol1 hourly.2</strong></li>
<li>open <strong>/vol/vol0/etc/cifsconfig_share.cfg</strong> and do a find and replace, changing every instance of <strong>vol1</strong> to <strong>vol1_clone</strong></li>
<li>run the command <strong>cifs restart</strong></li>
<li>run the command <strong>cifs shares -add oldvol1 /vol/vol1</strong></li>
<li>lock it down with <strong>cifs access oldvol1 &lt;your user&gt; &quot;Full Control&quot;</strong>, and make sure no other users or groups are left in that share&#8217;s access list</li>
</ol>
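<p>The find and replace in step 4 is the one place where a slip can hurt, since a blind replace of vol1 would also mangle a path like /vol/vol10. Here is a hedged Python sketch of a safer replace, to be run on the admin host against a copy of the file. The function name <strong>repoint_shares</strong> and the sample share lines are hypothetical, and I&#8217;m assuming the file holds one <strong>cifs shares -add</strong> line per share with /vol/&lt;volume&gt; paths, so check it against your own file first.</p>

```python
import re


def repoint_shares(config_text, old_vol="vol1", new_vol="vol1_clone"):
    """Rewrite /vol/<old_vol> paths to /vol/<new_vol>.

    Only whole volume names are matched, so /vol/vol10 and an existing
    /vol/vol1_clone are left alone.
    """
    pattern = re.compile(r"/vol/" + re.escape(old_vol) + r"(?![A-Za-z0-9_])")
    # A function replacement avoids any backslash processing of new_vol.
    return pattern.sub(lambda match: "/vol/" + new_vol, config_text)


# Hypothetical file contents, for illustration only.
sample = (
    'cifs shares -add "home" "/vol/vol1/home" -comment "User homes"\n'
    'cifs shares -add "projects" "/vol/vol1" -comment "Projects"\n'
    'cifs shares -add "archive" "/vol/vol10/archive"\n'
)

print(repoint_shares(sample))
```

<p>Running the function a second time on its own output changes nothing, so it is safe to re-run if you lose track of whether the file was already edited.</p>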
<h2>So what did this do?</h2>
<p>We quarantined the file system by stopping cifs and removing client access. Then we created a new clone of the volume that had been infected off of the snapshot we knew was clean. We updated all of our shares using a quick find and replace, giving our users access to their data so they can get back to work. We also exposed our old volume via a new cifs share, which we&#8217;ve restricted only to our user, who theoretically knows better than to muck with infected files. So why did we do all this work? Later, when new virus definitions come out, we can scan that volume overnight and have our anti-virus program go clean up our mucky infected files. Once we do, we can open up the share, send out an email, and give our users the ability to recover any important documents that had been created during those three hours, saving extra work, and potentially saving compliance headaches.</p>
<p>There is a final part of this, in which we split the clone off from its parent using the <strong>vol clone split start vol1_clone</strong> command. I would note &#8211; you will (at least temporarily) need to have enough free space on your aggregate to contain both of the volumes, w/o any space savings. Once your split is finished, you can do a <strong>vol offline vol1 ; vol destroy vol1 </strong>to get rid of the old volume and free up that space.</p>
<p>I think this is a better solution to the problem, and one that&#8217;s much more elegant than using a snap restore. I&#8217;d love to hear any feedback or improvements to this process, so if you found this useful, or can think of a better way, please let me know.</p>
]]></content:encoded>
			<wfw:commentRss>http://nicholasbernstein.com/blog/?feed=rss2&amp;p=64</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
