<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>Yale Daily News Online</title>
	<atom:link href="http://online.yaledailynews.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://online.yaledailynews.com</link>
	<description>About the online operations of the Yale Daily News.</description>
	<pubDate>Sat, 07 Jun 2008 20:13:41 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
	<language>en</language>
			<item>
		<title>Engine What?</title>
		<link>http://online.yaledailynews.com/2008/06/07/engine-what/</link>
		<comments>http://online.yaledailynews.com/2008/06/07/engine-what/#comments</comments>
		<pubDate>Sat, 07 Jun 2008 20:00:16 +0000</pubDate>
		<dc:creator>Robert Baskin (Online Director)</dc:creator>
		
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://online.yaledailynews.com/?p=17</guid>
		<description><![CDATA[So finally we come to the Server header, which I have alluded to previously. When you make a request to our server, we send back many &#8220;Headers&#8221; for each request before we send back the actual content you requested. They include things like Size and Expires headers. The best way to see all of the [...]]]></description>
			<content:encoded><![CDATA[<p>So finally we come to the Server header, which I have <a href="http://online.yaledailynews.com/2008/05/03/slice-and-dice/">alluded to</a> <a href="http://online.yaledailynews.com/2008/05/15/its-the-little-things-in-life/">previously</a>. <span id="more-17"></span>When you make a request to our server, we send back a number of &#8220;headers&#8221; before we send back the actual content you requested. They include things like Size and <a href="http://online.yaledailynews.com/2008/03/31/making-the-site-faster/">Expires headers</a>. The best way to see all of the headers, and much more information about how your browser loads our page, is with the <a href="http://www.getfirebug.com/">Firebug extension</a> for <a href="http://www.getfirefox.com">Firefox</a>. The Net tab will show you each request as it&#8217;s made to our server - how long it took, the headers it returned and the content we sent back. Examples of requests are for the main HTML page, each CSS and JS file, images and Flash objects.</p>
<p>Anyway, if you take a quick look, you&#8217;ll notice that our Server header is &#8220;nginx/0.5.26&#8221;. (Note: Don&#8217;t look on this blog; it&#8217;s not hosted on our production server. Look on <a href="http://yaledailynews.com">YaleDailyNews.com</a>.) You might notice that&#8217;s not one of the two servers that run the vast majority of Web sites out there on the Internet - <a href="http://httpd.apache.org/">Apache</a> and <a href="http://www.iis.net/default.aspx?tabid=1">Internet Information Services</a> (IIS). We used to send back Apache headers, but what is nginx?</p>
<p>To quote the <a href="http://wiki.codemongers.com/Main">English version of its documentation</a> (it&#8217;s a <a href="http://nginx.net/">Russian product</a>), &#8220;Nginx (pronounced &#8216;engine x&#8217;) is a free, open-source, high-performance HTTP server and reverse proxy, as well as an IMAP/POP3 proxy server.&#8221; Essentially, nginx is a FAST Web server. It&#8217;s great at handling requests for static files - CSS, JS, images, etc. Its downside is that it&#8217;s not very good at serving requests for dynamic pages: anything that involves parsing server-side code, which is PHP in our case.</p>
<p>So we decided to use nginx as what&#8217;s called a <a href="http://blog.kovyrin.net/2006/05/18/nginx-as-reverse-proxy/">reverse proxy</a>. When you make a request to our Web server, nginx handles that request. It checks to see if the request is for a static file. If it is, it serves the file. That&#8217;s much faster than Apache! Under the old system, Apache was serving both dynamic requests, which it&#8217;s good at, and static requests, which in a LAMP configuration it is not very good at. Even to serve a simple CSS file, it had to use its heavy PHP-interpreter-laden process, which is quite memory-intensive and relatively slow. Nginx is much faster.</p>
<p>But what about dynamic requests? After checking to see if the request is static and serving it if so, nginx passes on all other requests to Apache, which is better at handling PHP requests. Apache is running on our server on another port, so nginx passes the request to localhost on that port. Apache sees that request, generates the page using PHP and passes the content back to nginx, which sends it back to you.</p>
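<p>The routing decision described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how nginx is actually configured (nginx does this in its configuration file, not in code). The extension list, the port number 8080 and the function name <em>route</em> are all assumptions made for the example; the post does not name them.</p>

```python
from urllib.parse import urlsplit

# Hypothetical values: the post names neither the extensions nor the Apache port.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".gif", ".swf")
APACHE_BACKEND = "http://localhost:8080"


def route(url):
    """Return ("static", path) or ("proxy", backend_url) for a request URL."""
    parts = urlsplit(url)
    path = parts.path
    if path.lower().endswith(STATIC_EXTENSIONS):
        # Static file: the front-end server answers directly from disk.
        return ("static", path)
    # Everything else is handed to the backend listening on another local port.
    backend_url = APACHE_BACKEND + path
    if parts.query:
        backend_url += "?" + parts.query
    return ("proxy", backend_url)
```

<p>In this sketch a request for a stylesheet never reaches the backend at all, which is the whole point of putting the lighter server in front.</p>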
<p>I wish we had better numbers, but we didn&#8217;t really do as much benchmarking as we should have. One of our main bottlenecks is memory (especially now that we&#8217;re doing view caching, which will be the subject of my next post), and nginx uses much less memory than Apache, so we&#8217;re able to serve more requests without overloading our server. I&#8217;d estimate a 20 percent increase in requests per second.</p>
<p>The other alternative we considered was moving our static assets to <a href="http://www.amazon.com/S3-AWS-home-page-Money/b?ie=UTF8&amp;node=16427261">Amazon S3</a>. However, in order to do so, we would have to rework our code to point to the right image location, develop a deployment process to synchronize to S3 and pay Amazon&#8217;s bandwidth costs. We may revisit the idea in the future, though, and we may also consider moving to <a href="http://aws.amazon.com/ec2">Amazon EC2</a> for hosting. For the time being, however, nginx was an easy and helpful solution.</p>
<p>Sorry about the delay. Posts should come more regularly now. I&#8217;m aiming for one every week, but don&#8217;t hold me to it!</p>
]]></content:encoded>
			<wfw:commentRss>http://online.yaledailynews.com/2008/06/07/engine-what/feed/</wfw:commentRss>
		</item>
		<item>
		<title>It&#8217;s the Little Things in Life</title>
		<link>http://online.yaledailynews.com/2008/05/15/its-the-little-things-in-life/</link>
		<comments>http://online.yaledailynews.com/2008/05/15/its-the-little-things-in-life/#comments</comments>
		<pubDate>Thu, 15 May 2008 18:00:39 +0000</pubDate>
		<dc:creator>Robert Baskin (Online Director)</dc:creator>
		
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://online.yaledailynews.com/?p=16</guid>
		<description><![CDATA[I know I said my next post would have to do with our new Server header that we&#8217;re sending back, but I just wanted to get this in while it&#8217;s fresh in my mind. And plus, MySQL optimization is always worth talking about!
There are several places around our site where we use MySQL&#8217;s DATEDIFF function, [...]]]></description>
			<content:encoded><![CDATA[<p>I know <a href="http://online.yaledailynews.com/2008/05/03/slice-and-dice/">I said</a> my next post would have to do with our new Server header that we&#8217;re sending back, but I just wanted to get this in while it&#8217;s fresh in my mind. And plus, MySQL optimization is always worth talking about!<span id="more-16"></span></p>
<p>There are several places around our site where we use MySQL&#8217;s <a href="http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html#function_datediff">DATEDIFF function</a>, which calculates the number of days between two given times. One example is the <a href="http://www.yaledailynews.com/#mostpopularbox">&#8220;Most Popular&#8221; box</a>, which shows the most popular viewed, e-mailed and commented stories on our site. That works by retrieving stories posted within the last week sorted by the relevant count (hitcount, emailcount, or commentcount). If there aren&#8217;t 10 stories within the past seven days, it will go to 14 days to try and fill the box, then 21 and so on.</p>
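<p>The widening look-back described above might look roughly like the following Python sketch. The callable <em>fetch</em> is hypothetical: it stands in for whatever runs the real query and returns the stories posted within the given number of days, already sorted by the relevant count. The <em>max_days</em> cap is also an assumption, added so the loop always terminates.</p>

```python
def most_popular(fetch, wanted=10, step_days=7, max_days=365):
    """Widen the look-back window until `fetch` returns enough stories.

    `fetch(days)` is a hypothetical callable returning the stories posted
    in the last `days` days, sorted by popularity.
    """
    days = step_days
    stories = fetch(days)
    # Try 7 days, then 14, then 21, and so on, until the box can be filled.
    while wanted > len(stories) and max_days > days:
        days += step_days
        stories = fetch(days)
    return stories[:wanted], days
```

<p>Each extra pass through the loop is another query, which is why a slow query hurts most over breaks, when fewer stories are published.</p>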
<p>The query we were running had the following WHERE clause: DATEDIFF(NOW(), articles.dateposted) &lt; 7. That got back only articles posted in the last week. That worked like we wanted it to, but it was a bit slow. Instead of taking just a couple milliseconds, it would take about half a second. Remember that it has to do at least three queries to get all three types, and if it doesn&#8217;t find enough articles within the past week, it has to run the queries again with 14 days, then 21 etc. That really only happens over breaks though, when we&#8217;re not publishing daily. Regardless, even three queries taking half a second is a lot.</p>
<p>The problem was mitigated by the fact that we cache the results of our database calls. So we actually only generate that &#8220;Most Popular&#8221; box every couple hours, rather than for every request. So did I actually need to optimize it? Well why not! I enjoy doing this kind of stuff, and it looked like a quick fix. I was reading some <a href="http://www.jpipes.com/index.php?/archives/231-Join-fu-The-Art-of-SQL-Tuning.html">slides called Join-Fu</a> by <a href="http://jpipes.com/">Jay Pipes</a>, who works for MySQL. Slide 17 was particularly relevant. He was talking about date calculations and optimizing them. I thought of our &#8220;Most Popular&#8221; box and set about optimizing it.</p>
<p>The problem with our query was that it could not take advantage of two important parts of what makes MySQL fast: indexes and query caching. First, when a query applies a function to a column, MySQL apparently can&#8217;t use an index on that column. (Indexes are kind of a large topic to get into in this post, so I&#8217;ll just direct you to <a href="http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html">some reading on them</a>.) Not using indexes means the query runs much more slowly. To solve this problem, I followed his advice and rewrote the WHERE condition to be &#8220;articles.dateposted &gt; NOW() - INTERVAL 7 DAY,&#8221; using MySQL&#8217;s INTERVAL syntax. The time taken to execute the query dropped to around 100 milliseconds - 1/5 of what it was!</p>
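<p>The effect of wrapping the column in a function can be seen in any database that will show you its query plan. The sketch below uses SQLite from Python&#8217;s standard library, not MySQL, purely because it runs anywhere without a server; the date functions differ from MySQL&#8217;s, and the table layout and index name are assumptions. The principle is the same: in the first query the column sits inside a function call, in the second it is compared directly to a value.</p>

```python
import sqlite3


def plan(conn, sql, params):
    """Return the query-plan detail strings SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for the real articles table.
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, dateposted TEXT)")
conn.execute("CREATE INDEX idx_dateposted ON articles (dateposted)")

now = "2008-05-15 18:00:00"
cutoff = "2008-05-08 18:00:00"

# Column wrapped in a function: the planner has to examine every row (a scan).
wrapped = plan(
    conn,
    "SELECT id FROM articles WHERE 7 > julianday(?) - julianday(dateposted)",
    (now,),
)

# Bare column compared to a value: the planner can do a range search on the index.
bare = plan(conn, "SELECT id FROM articles WHERE dateposted > ?", (cutoff,))

print(wrapped)
print(bare)
```

<p>On the SQLite versions I am aware of, the first plan is reported as a scan and the second as a search using idx_dateposted; the exact wording varies between versions.</p>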
<p>But because the query still called NOW(), whose value changes from one execution to the next, MySQL had to evaluate it when the query was executed and therefore couldn&#8217;t cache the results of the query. Again, query caching is a bit outside the scope of this post (<a href="http://dev.mysql.com/doc/refman/5.0/en/query-cache.html">more reading on it</a>), but at the basic level, MySQL caches the results of queries it executes, so the next time they&#8217;re run it serves the cached result instead of running the query again. I wanted to make our query use the query cache instead of having to be run every time. To do that, I just replaced the NOW() in the WHERE condition with today&#8217;s date as generated by PHP&#8217;s date function. So now the WHERE condition looks like &quot;articles.dateposted &gt; &#39;&quot; . date(&#39;Y-m-d&#39;) . &quot;&#39; - INTERVAL 7 DAY&quot;. That query takes about 100 milliseconds to run the first time, but one millisecond or less to run each subsequent time. Remember: MySQL&#8217;s query cache persists beyond requests, so it&#8217;s cached for as long as MySQL is running (or until MySQL expires it), not just until the end of PHP&#8217;s life at the end of the request.</p>
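<p>The trick is simply that the text of the query stays identical for a whole day, so a cache keyed on the query text can keep answering it. Here is the same idea as a Python sketch rather than PHP; the function name is made up for the example, and the date is passed in so the behavior is easy to check.</p>

```python
from datetime import date


def popular_where_clause(today, days=7):
    """Build a WHERE clause whose text only changes once per day.

    The date literal is generated by the application, so the database sees
    a constant instead of a call to NOW().
    """
    return "articles.dateposted > '%s' - INTERVAL %d DAY" % (today.isoformat(), days)


clause = popular_where_clause(date(2008, 5, 15))
print(clause)
```

<p>One trade-off worth noting: the cutoff is now midnight rather than the current time, so the window is up to a day longer than exactly seven days. For a &#8220;Most Popular&#8221; box that seems a reasonable price.</p>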
<p>Was the optimization worth it? Probably not, but it&#8217;s always nice to improve the speed of frequently run queries by orders of magnitude.</p>
]]></content:encoded>
			<wfw:commentRss>http://online.yaledailynews.com/2008/05/15/its-the-little-things-in-life/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Slice and Dice</title>
		<link>http://online.yaledailynews.com/2008/05/03/slice-and-dice/</link>
		<comments>http://online.yaledailynews.com/2008/05/03/slice-and-dice/#comments</comments>
		<pubDate>Sat, 03 May 2008 18:36:10 +0000</pubDate>
		<dc:creator>Robert Baskin (Online Director)</dc:creator>
		
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://online.yaledailynews.com/?p=15</guid>
		<description><![CDATA[So in my last post, I promised I would write up some of the work we planned to do to improve performance when we had a chance to actually implement the proposed changes. Well, we&#8217;ve done quite a bit of work in the past week. I&#8217;ll talk about some of the changes and leave some [...]]]></description>
			<content:encoded><![CDATA[<p>So in my last post, I promised I would write up some of the work we planned to do to improve performance when we had a chance to actually implement the proposed changes. Well, we&#8217;ve done quite a bit of work in the past week. I&#8217;ll talk about some of the changes and leave some for future posts.<span id="more-15"></span></p>
<p>First things first: We changed hosts. For the past year or so, we&#8217;ve been hosted on a dedicated box at <a href="http://softlayer.com">Softlayer</a>. I have nothing but good things to say about Softlayer - fair prices (a little higher than <a href="http://layeredtech.com">Layeredtech</a>, but competitive), good features, responsive support, etc. etc. But the server we purchased was the first dedicated server any of us had ever managed, and crud had started to accumulate and make things very difficult.</p>
<p>To begin with, we had Softlayer set up our box with <a href="http://www.cpanel.net/">CPanel</a>, which is sort of the industry standard when it comes to server-management control panels. The problem was that as we started to become more competent, we began changing things manually, and CPanel couldn&#8217;t stay in sync and thus was rendered effectively useless. I also had a sneaking suspicion that some of its &#8220;helpful&#8221; services that run to keep your server maintained were actually slowing us down. I definitely thought that it was time to move to a new server, one where we could start fresh.</p>
<p>After reading around (the <a href="http://webhostingtalk.com">Web Hosting Talk forums</a> are invaluable), I decided <a href="http://slicehost.com">Slicehost</a> would be a good place for us. (We also considered <a href="http://linode.com">Linode</a>.) As I alluded to in my previous post, getting a VPS offered several benefits over a dedicated server. First, Slicehost takes daily and weekly snapshot backups of our server, meaning we could restore to a working server easily if something ever happened. Second, we could upgrade the size of our &#8220;slice&#8221; if we ever experienced a surge in traffic. I suggested it would take &#8220;a matter of minutes,&#8221; but as <a href="http://online.yaledailynews.com/2008/04/25/the-day-the-music-died/#comments">neodude pointed out</a>, it actually takes about 10-15 minutes. Regardless, being able to upgrade our server that easily is a very helpful feature.</p>
<p>Additionally, <a href="http://articles.slicehost.com">Slicehost&#8217;s tutorials</a> for setting up a LAMP server are excellent. We decided to install <a href="https://wiki.ubuntu.com/GutsyGibbon">Ubuntu Gutsy</a> on our slice, and using the magic of <a href="http://en.wikipedia.org/wiki/Advanced_Packaging_Tool">apt-get</a>, we had everything we needed set up in just a couple hours (much of that learning how to do some fairly basic things). Most importantly, the server was fresh and entirely configured by us: We could make changes easily and keep everything organized how we wanted to. It is great.</p>
<p>If you have some Linux competency, you should consider setting up your own server. With package managers these days, you don&#8217;t need to worry about compiling, just configuring, and even the defaults are generally suitable, so it&#8217;s very easy.</p>
<p>There are other fantastic features of Slicehost. Although not actually run by the company, the members of the <a href="irc://irc.freenode.net/slicehost">Slicehost IRC channel</a> have been extremely helpful. Also, Slicehost provides an SSH console in their Web manager that you can use to log in to the server if SSH stops working for some reason. (Like, for example, forgetting to leave your SSH port open in the iptables.)</p>
<p>We&#8217;ve made some other tweaks to make our Web site run faster. (Hint: look at the &#8220;Server&#8221; header we&#8217;re sending back.) The goal is to survive a Drudging or something similar. I will have more information in future posts about some more of what we&#8217;re doing.</p>
]]></content:encoded>
			<wfw:commentRss>http://online.yaledailynews.com/2008/05/03/slice-and-dice/feed/</wfw:commentRss>
		</item>
		<item>
		<title>The Day the Music Died</title>
		<link>http://online.yaledailynews.com/2008/04/25/the-day-the-music-died/</link>
		<comments>http://online.yaledailynews.com/2008/04/25/the-day-the-music-died/#comments</comments>
		<pubDate>Fri, 25 Apr 2008 05:10:21 +0000</pubDate>
		<dc:creator>Robert Baskin (Online Director)</dc:creator>
		
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://online.yaledailynews.com/?p=14</guid>
		<description><![CDATA[Last Thursday around 8 a.m., I received a phone call from our Online Editor informing me that the Web site was down, and politely asking if I could get it back online as soon as possible. It seems we had been linked to by the Drudge Report (to our story on Aliza Shvarts), and the [...]]]></description>
			<content:encoded><![CDATA[<p>Last Thursday around 8 a.m., I received a phone call from our Online Editor informing me that the Web site was down, and politely asking if I could get it back online as soon as possible. It seems we had been linked to by the <a href="http://drudgereport.com">Drudge Report</a> (to our <a href="http://www.yaledailynews.com/articles/view/24513">story on Aliza Shvarts</a>), and the deluge of hits had knocked us off the Internet. Over the course of Thursday and Friday, we were linked to (often simultaneously) by Drudge, <a href="http://digg.com">Digg</a>, <a href="http://reddit.com">Reddit</a>, <a href="http://gawker.com">Gawker</a>, <a href="http://perezhilton.com">Perez Hilton</a>, <a href="http://foxnews.com">Fox News</a>, <a href="http://msnbc.com">MSNBC</a> and others. On Thursday, we received 12 times our average amount of traffic. In this post, I&#8217;ll go over some of the technical details of what happened over that two-day period, and some changes we&#8217;re implementing to make things better.<span id="more-14"></span></p>
<p>The first goal was to get some page online, instead of just appearing down. Not loading anything was the worst thing that could have happened. Fortunately, we have a check built in very early in our application to see if it can connect to MySQL. All we had to do was change the MySQL password to be incorrect, and the application started redirecting to our error page. That&#8217;s fairly simple for the server to handle — all it has to do is try and connect to MySQL, fail, and then redirect to a static HTML page. (We didn&#8217;t shut down MySQL entirely because it also serves some other sites besides yaledailynews.com, and we wanted those to continue working as much as possible.) But an error page definitely isn&#8217;t ideal, and if we didn&#8217;t get the site back up quickly, Drudge would drop its link. We would lose the traffic, and it would be less likely to link to us again.</p>
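<p>As a rough sketch of that early check, in Python rather than our PHP: the function and parameter names here are hypothetical, as is the error page path, and the real application surely does more. The point is only that the failure path does almost no work.</p>

```python
def handle_request(connect_to_db, render_page, error_page="/error.html"):
    """Serve the page, or redirect to a static error page if the DB is unreachable.

    `connect_to_db` and `render_page` are hypothetical callables standing in
    for the application's real database connection and page generation.
    """
    try:
        db = connect_to_db()
    except ConnectionError:
        # Cheap failure path: no queries, no templates, just a redirect.
        return (302, error_page)
    return (200, render_page(db))
```

<p>With a deliberately wrong password, every request takes the first branch, and the server only has to hand out one static file.</p>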
<p>The next step was to get something showing up for Drudge users. We copied the text of the article and pasted it into a static HTML file. Then we had our application redirect all users coming from Drudge to that static HTML file with the popular story. Our server was able to handle that. But it was erroring for everybody else, which was no good. So we put a link on the error page to the static version of the story, which seemed like a good idea at the time. However, as our editors informed us, we didn&#8217;t want to seem like we were blowing the story up unnecessarily, so we removed the link and went back to the drawing board.</p>
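<p>Routing by referrer, as described above, comes down to one check on the Referer header. The following Python sketch shows the idea; the static file path and the function name are assumptions, and the real check lived inside our PHP application.</p>

```python
from urllib.parse import urlsplit

STATIC_STORY = "/static/story.html"  # hypothetical path to the pasted-in story


def destination(referer, requested_path):
    """Send visitors arriving from Drudge to the static copy of the story."""
    host = (urlsplit(referer or "").hostname or "").lower()
    if host == "drudgereport.com" or host.endswith(".drudgereport.com"):
        return STATIC_STORY
    # Everyone else gets whatever they asked for.
    return requested_path
```

<p>Matching on the host name rather than on the whole header avoids catching an unrelated URL that merely mentions Drudge somewhere in its path.</p>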
<p>The story was continuing to explode. Our static page was holding up, but we really wanted to get all of our stories back online. We had implemented view caching (I will blog about this in the future), and the pages were being cached in memory in <a href="http://xcache.lighttpd.net/">XCache</a>. However, I noticed that we were hitting our memory limit as Apache processes were spawned. Our caching system can fall back to file-based caching if we tell it to, so I figured we could try that. I allowed only my IP address to be able to access the main site, and clicked on a couple pages to prime the caches. This is important — if we started our site up with an empty cache, the server would overload trying to fill the cache as people started visiting. I primed the cache, then opened things up.</p>
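<p>To make the priming step concrete, here is a small Python sketch of a file-backed page cache and a function that fills it before the site is reopened. It is not our caching system; the class, its methods and the hashing of paths into file names are all made up for the illustration.</p>

```python
import hashlib
import os
import tempfile


class FileCache:
    """Tiny file-backed page cache: one file per URL path (illustrative only)."""

    def __init__(self, directory=None):
        # Hypothetical default: a fresh temporary directory.
        self.directory = directory or tempfile.mkdtemp(prefix="pagecache-")

    def _file_for(self, path):
        name = hashlib.sha1(path.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, name)

    def get(self, path):
        try:
            with open(self._file_for(path), encoding="utf-8") as handle:
                return handle.read()
        except FileNotFoundError:
            return None

    def set(self, path, html):
        with open(self._file_for(path), "w", encoding="utf-8") as handle:
            handle.write(html)


def prime(cache, render, paths):
    """Render each page once, before the site is opened back up to everyone."""
    for path in paths:
        if cache.get(path) is None:
            cache.set(path, render(path))
```

<p>Because the pages live on disk instead of in memory, they do not compete with Apache processes for RAM, which was the limit we kept hitting.</p>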
<p>Wonder of wonders, miracle of miracles, the site stayed alive. It was a little slow, but it was going. We kept users with Drudge in their referrer going to the static site. Drudge&#8217;s traffic is overwhelming — it dwarfed all of the other referrers, which are not small. With this setup, we managed to stay alive for most of the rest of the time.</p>
<p>On Friday, we hit the front page of Digg, Fox News and Drudge at the same time for our follow-up stories. I made static pages for each of those stories, and routed all traffic for those specific URLs to the static pages. With those optimizations, we made it through the week, and over the weekend traffic subsided to manageable levels.</p>
<p>Going forward, we are implementing some changes very soon. First of all, we are going to move from our dedicated host to a virtual private server. We&#8217;ll be able to rid ourselves of CPanel, which is a waste of resources if you can manage a LAMP server adequately by yourselves. Also, we&#8217;ll be able to resize and get more resources in a matter of minutes, rather than hours, in case we get another spike. Additionally, we&#8217;re considering moving some of our static files (CSS and background images, JavaScript files) to Amazon S3. That will result in faster downloads for our visitors, and Apache won&#8217;t have to serve as many requests. Even though it&#8217;s fast with static files, it can only help.</p>
<p>So that&#8217;s what happened. As we implement some further changes, I will blog more about them. The goal is to be able to survive a Drudging or Digging or any major linkage. We are close to getting there, but we have some work to do.</p>
]]></content:encoded>
			<wfw:commentRss>http://online.yaledailynews.com/2008/04/25/the-day-the-music-died/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Apology for Duplicate Emails</title>
		<link>http://online.yaledailynews.com/2008/04/20/apology-for-duplicate-emails/</link>
		<comments>http://online.yaledailynews.com/2008/04/20/apology-for-duplicate-emails/#comments</comments>
		<pubDate>Mon, 21 Apr 2008 04:39:54 +0000</pubDate>
		<dc:creator>Robert Baskin (Online Director)</dc:creator>
		
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://online.yaledailynews.com/?p=13</guid>
		<description><![CDATA[Dear YaleDailyNews.com subscribers,
Some of you may have erroneously received many duplicate emails from the Yale Daily News in the past few minutes. It was a result of server testing we were conducting to improve our headlines email service, and it went awry. I want to apologize for the error, and assure you that we are [...]]]></description>
			<content:encoded><![CDATA[<p>Dear YaleDailyNews.com subscribers,</p>
<p>Some of you may have erroneously received many duplicate emails from the Yale Daily News in the past few minutes. It was a result of server testing we were conducting to improve our headlines email service, and it went awry. I want to apologize for the error, and assure you that we are taking steps to prevent it from happening again. If you have any questions, please feel free to contact me.</p>
<p>Thank you for visiting the Yale Daily News online.</p>
<p>Robert Baskin<br />
Online Director, Yale Daily News<br />
webmaster@yaledailynews.com</p>
]]></content:encoded>
			<wfw:commentRss>http://online.yaledailynews.com/2008/04/20/apology-for-duplicate-emails/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Two New Features at the Yale Daily News</title>
		<link>http://online.yaledailynews.com/2008/04/12/two-new-features-at-the-yale-daily-news/</link>
		<comments>http://online.yaledailynews.com/2008/04/12/two-new-features-at-the-yale-daily-news/#comments</comments>
		<pubDate>Sat, 12 Apr 2008 13:00:24 +0000</pubDate>
		<dc:creator>Andrew Mangino (Editor-in-Chief)</dc:creator>
		
		<category><![CDATA[Content]]></category>

		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://online.yaledailynews.com/?p=12</guid>
		<description><![CDATA[Dear YaleDailyNews.com subscribers,
I&#8217;m pleased to announce the introduction this week of two interactive additions to YaleDailyNews.com. Even if each serves a &#8220;niche&#8221; interest, I hope one or the other sparks in you a related idea for how the Yale Daily News can continue to enhance its online presence.
For decades, Yale College Council elections have dominated [...]]]></description>
			<content:encoded><![CDATA[<p>Dear YaleDailyNews.com subscribers,</p>
<p>I&#8217;m pleased to announce the introduction this week of two interactive additions to YaleDailyNews.com. Even if each serves a &#8220;niche&#8221; interest, I hope one or the other sparks in you a related idea for how the Yale Daily News can continue to enhance its online presence.<span id="more-12"></span></p>
<p>For decades, Yale College Council elections have dominated Yale College life for at least one week each spring. But this year, the regular inundation of commentary and campaigning is set to amplify at least tenfold with the introduction of <a href="http://www.yaledailynews.com/blogs/yaledecides08">&#8220;Yale Decides 2008,&#8221;</a> a live blog established to guide the student population through next week&#8217;s race with punditry, predictions, interviews, video and analysis.</p>
<p>Second, in an attempt to promote a more transparent newsroom, we have launched <a href="http://www.yaledailynews.com/articles/view/24314">&#8220;Inside 202 York,&#8221;</a> a forum in which readers can prod editors &#8212; the News editors in this week&#8217;s case &#8212; with questions, however friendly or hostile, about recent coverage or general practices.</p>
<p>Thank you, as always, for your ideas and support.</p>
<p>Sincerely,<br />
Andrew Mangino<br />
Editor in Chief<br />
Yale Daily News</p>
]]></content:encoded>
			<wfw:commentRss>http://online.yaledailynews.com/2008/04/12/two-new-features-at-the-yale-daily-news/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Tags, Tags, Tags</title>
		<link>http://online.yaledailynews.com/2008/04/03/tags-tags-tags/</link>
		<comments>http://online.yaledailynews.com/2008/04/03/tags-tags-tags/#comments</comments>
		<pubDate>Fri, 04 Apr 2008 02:51:12 +0000</pubDate>
		<dc:creator>Henry Corrigan-Gibbs (Director of Online Initiatives)</dc:creator>
		
		<category><![CDATA[Content]]></category>

		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://online.yaledailynews.com/?p=10</guid>
		<description><![CDATA[In class today I was sitting behind another student who appeared (to the professor, at least) to be furiously typing notes into his laptop.  From my perch behind him, I could see that this kid was browsing the Yale Daily News Web site.
While I was perplexed that he bothered to drag himself out of bed [...]]]></description>
			<content:encoded><![CDATA[<p>In class today I was sitting behind another student who appeared (to the professor, at least) to be furiously typing notes into his laptop.  From my perch behind him, I could see that this kid was browsing the <a href="http://www.yaledailynews.com/">Yale Daily News Web site</a>.</p>
<p>While I was perplexed that he bothered to drag himself out of bed for a 9 a.m. lecture that he wasn&#8217;t even going to listen to, I was ecstatic that he chose to spend his time looking at our Web site.</p>
<p>Competition for viewers online is fierce, so once we get a reader on our site we try to do everything in our means to keep them from browsing to some <a href="http://www.metacafe.com/watch/865433/dancing_monkeys/">other entertaining Web site</a>.  An extensive <a href="http://www.yaledailynews.com/tags/">tagging system</a> is one of the prime ways we&#8217;ve come up with to encourage people to explore the site instead of leaving after reading just <a href="http://yaledailynews.com/articles/view/2759">one popular article</a>.  By providing links to other content that users might be interested in, the site tries to guess what interests each viewer.<span id="more-10"></span></p>
<p>For example, when a viewer is reading an article about <a href="http://yaledailynews.com/articles/view/23750">Richard Levin</a>, they will be given links to related content pages about the <a href="http://yaledailynews.com/tags/view/Corporation">Yale Corporation</a>, <a href="http://yaledailynews.com/tags/view/Fourteen%20Colleges">the campus expansion</a>, and <a href="http://yaledailynews.com/tags/view/President%20Levin">Richard Levin</a> himself.  Once on those topic pages, a viewer can click through to any number of other articles, tags, photos, or videos related to that topic.</p>
<p>Cool, huh?</p>
<p>Conceptually, the tagging system is very simple.  When editors upload articles to the site, they add content tags to each story by hand.  Once the story has been put online, the content management system does the rest of the work.</p>
<p>One of the problems we ran into with this feature was that the editors refused/forgot/didn&#8217;t want to add tags to stories and photos as they uploaded them to the site.  We quickly realized that if we wanted the system to get used, the content management system would have to tag stories automatically.  While we still allow editors to tag stories manually, we now tag stories automatically when they&#8217;re uploaded by searching through old articles and guessing where a tag might apply.</p>
<p>While we were initially skeptical about having an auto-tag feature, it turned out to work pretty well.  For example, when looking for articles to tag with <em>Admissions</em>, the system finds:</p>
<ul>
<li><a href="http://yaledailynews.com/articles/view/24195">Elis graduate with no skills — and few prospects</a></li>
<li><a href="http://yaledailynews.com/articles/view/24181">Ivy admissions prompt frenzy</a></li>
<li><a href="http://yaledailynews.com/articles/view/24166">Acceptance rates decline across the Ancient Eight</a></li>
</ul>
<p>Pretty good.</p>
<p>However, when the auto-tag feature looks for articles to tag with the word <em>Court</em> (as in the New Haven County Superior Court), it doesn&#8217;t do as well:</p>
<ul>
<li><a href="http://yaledailynews.com/articles/view/24197">Appointment of Tony Blair sullies University reputation</a></li>
<li><a href="http://yaledailynews.com/articles/view/24183">Before Yale, Gentry a ‘finesse player’ on court</a></li>
<li><a href="http://yaledailynews.com/articles/view/24146">Shin found guilty of faking doctorate</a></li>
</ul>
<p>We generally ended up getting more sports articles with the word <em>court</em> (e.g. basketball court, tennis court, etc.) than articles about the legal system.  Refining the tag name a little, by changing <em>Court</em> to <a href="http://www.yaledailynews.com/tags/view/Crime"><em>Crime</em></a>, makes the search work much better.</p>
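<p>The ambiguity is easy to reproduce with a naive whole-word match, sketched here in Python. This is an assumption about the general approach, not our actual auto-tagger, and it looks only at the headlines listed above rather than at full article text.</p>

```python
import re


def auto_tag(tag, texts):
    """Return the texts that contain the tag as a whole word, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(tag) + r"\b", re.IGNORECASE)
    return [text for text in texts if pattern.search(text)]


headlines = [
    "Before Yale, Gentry a 'finesse player' on court",
    "Shin found guilty of faking doctorate",
    "Ivy admissions prompt frenzy",
]

court_matches = auto_tag("Court", headlines)
admissions_matches = auto_tag("Admissions", headlines)
print(court_matches)
print(admissions_matches)
```

<p>A plain word match has no idea whether &#8220;court&#8221; means a courtroom or a basketball court, so the sports headline is the one it picks up, while the headline about a guilty verdict is missed.</p>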
<p>To incorporate the tag system into the user experience, we decided to put lots and lots of links to tag pages all over the place.  Every time the editors post an article, the system automatically links every instance of a tag phrase in the article body to its corresponding tag topic page.</p>
<p>Look familiar?  We got some inspiration from the <em><a href="http://www.nytimes.com/">New York Times&#8217;</a></em> application of the same concept.</p>
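<p>For the curious, the lookup behind the automatic linking can be sketched as follows in Python. It only works out which tag phrases occur in a body of text and what their tag page URLs would be, following the URL pattern visible in the links above; the function name and the sample sentence are invented, and the real system goes on to rewrite the article body.</p>

```python
import re
from urllib.parse import quote

TAG_BASE = "http://yaledailynews.com/tags/view/"


def tag_links(body, tags):
    """Map each tag phrase that occurs in the body to its tag-page URL."""
    links = {}
    for tag in tags:
        if re.search(r"\b" + re.escape(tag) + r"\b", body):
            # Spaces become %20, as in the tag URLs linked from this post.
            links[tag] = TAG_BASE + quote(tag)
    return links


# Made-up sentence, used only as sample input.
sample = "President Levin met with the Corporation on Friday."
found = tag_links(sample, ["President Levin", "Corporation", "Fourteen Colleges"])
print(found)
```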
]]></content:encoded>
			<wfw:commentRss>http://online.yaledailynews.com/2008/04/03/tags-tags-tags/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Making the Site Faster</title>
		<link>http://online.yaledailynews.com/2008/03/31/making-the-site-faster/</link>
		<comments>http://online.yaledailynews.com/2008/03/31/making-the-site-faster/#comments</comments>
		<pubDate>Mon, 31 Mar 2008 05:51:33 +0000</pubDate>
		<dc:creator>Robert Baskin (Online Director)</dc:creator>
		
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://online.yaledailynews.com/2008/03/31/making-the-site-faster/</guid>
		<description><![CDATA[Last summer, I interned at Yahoo as a Backend Engineer for Yahoo News, which was a fantastic experience. While I was there, Yahoo released a product called YSlow, which is a plug-in for Firebug — the most useful Firefox extension ever created. It analyzes Web pages, tells you why they&#8217;re slow and how to make them [...]]]></description>
			<content:encoded><![CDATA[<p>Last summer, I interned at Yahoo as a Backend Engineer for Yahoo News, which was a fantastic experience. While I was there, Yahoo released a product called <a href="http://www.yahooapis.com/yslow/">YSlow</a>, which is a plug-in for <a href="http://www.getfirebug.com/">Firebug</a> — the most useful Firefox extension ever created. It analyzes Web pages, tells you why they&#8217;re slow and how to make them faster. It has inspired many Web developers, myself included, to start thinking more seriously about client-side performance. Server-side performance — MySQL queries, PHP page generation times, etc. — is important, but client-side performance is also very significant.<br />
<span id="more-9"></span></p>
<p>One of the first steps we took was to combine CSS and JavaScript files into one file for production. In development, we split up those files to make them easier to manage. But, in production, you want browsers to have to make as few connections as possible. Each HTTP request for a new file (HTML, CSS, JavaScript, image, Flash object, etc.) takes time to look up the domain name, connect successfully and download the file. Therefore, we combine our code into one CSS and one JS file.</p>
<p>You may notice two things about these files. First of all, we remove all white space and unnecessary characters — for example, the last semicolon in a CSS declaration isn&#8217;t actually needed. We run the <a href="http://code.google.com/p/minify/">Minify</a> library on our files to format them to minimize their size. We also have Apache&#8217;s GZIP module enabled to zip the files as they go over the wire to minimize transfer time and bandwidth costs. GZIPing of CSS and JS files isn&#8217;t enabled by default, because older browsers could not handle it properly, but all <a href="http://developer.yahoo.com/yui/articles/gbs/index.html">modern Grade-A browsers</a> accept it with no problem.</p>
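<p>As a rough illustration of the two steps above, the sketch below shrinks a stylesheet and then gzips it. It is not the Minify library and not our Apache configuration: the <code>crude_minify_css</code> function and the sample stylesheet are hypothetical, and the regular expressions are far cruder than a real minifier (they would mangle whitespace inside quoted strings, for example).</p>

```python
import gzip
import re


def crude_minify_css(css):
    """Rough CSS shrink: drop comments, collapse whitespace, drop the last semicolon per block."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # strip comments
    css = re.sub(r"\s+", " ", css)                         # collapse whitespace runs
    css = re.sub(r"\s*([{}:;,])\s*", r"\1", css)           # no spaces around punctuation
    css = css.replace(";}", "}")                           # the final semicolon is optional
    return css.strip()


# Hypothetical stylesheet, written for illustration only.
sample = """
/* hypothetical stylesheet */
body {
    margin: 0;
    font-family: Georgia, serif;
}
"""

minified = crude_minify_css(sample)
# Compress the minified text the way a server would before sending it.
compressed = gzip.compress(minified.encode("utf-8"))
```

<p>On a file this small the gzip header can outweigh the savings, so the benefit only shows up on real, full-size stylesheets and scripts.</p>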
<p>The second thing you may notice is the vXXXXX in the filenames (<a href="http://www.yaledailynews.com/css/ydn-min.v033008213047.css">CSS</a> and <a href="http://www.yaledailynews.com/js/ydn-min.v033008213047.js">JS</a>). We have Expires headers turned on for our CSS and JS files. That means that, when your browser requests them for the first time, we let it know that those files will not change for the next 10 years — forever, as far as the Web is concerned. The next time you load the page, your browser knows they haven&#8217;t changed, so it doesn&#8217;t request them again. But, of course, we do change those files whenever we update our CSS or JS, and the only way to make your browser request them is to change the filename. To avoid having to go through our code and change the filenames every place they&#8217;re included, we pass a &#8220;Last Updated&#8221; time stamp in the filename. Apache Rewrite Rules remove that from the request, so the Web server always serves ydn-min.css or ydn-min.js. All we have to do is update the time stamp whenever we change the CSS or JS.</p>
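<p>The time-stamp trick can be sketched like this. It is only an illustration: the function names are hypothetical, the stamp layout (month, day, two-digit year, hour, minute, second) is a reading of the example filename rather than a documented format, and the second function merely imitates in Python what the Apache Rewrite Rules do on the server.</p>

```python
import re
from datetime import datetime


def versioned_name(filename, updated):
    """Insert a vMMDDYYHHMMSS stamp before the file extension."""
    stem, dot, extension = filename.rpartition(".")
    stamp = updated.strftime("%m%d%y%H%M%S")
    return "{}.v{}{}{}".format(stem, stamp, dot, extension)


def strip_version(requested):
    """Imitate the rewrite step: drop the .vDIGITS part so the real file is served."""
    return re.sub(r"\.v\d+(\.(?:css|js))$", r"\1", requested)


# The date and time here are chosen to reproduce the stamp in the example filename.
name = versioned_name("ydn-min.css", datetime(2008, 3, 30, 21, 30, 47))
served = strip_version("/css/" + name)
```

<p>Because the stamp lives only in the URL, a single file on disk can be served under a new address every time it changes, and browsers treat each address as a brand new file.</p>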
<p>Those are just some of the performance measures we&#8217;ve enacted. We&#8217;re hoping to do more, both on the server and client side, in the near future.</p>
]]></content:encoded>
			<wfw:commentRss>http://online.yaledailynews.com/2008/03/31/making-the-site-faster/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Behind the Scenes of the Crime Map</title>
		<link>http://online.yaledailynews.com/2008/03/04/behind-the-scenes-of-the-crime-map/</link>
		<comments>http://online.yaledailynews.com/2008/03/04/behind-the-scenes-of-the-crime-map/#comments</comments>
		<pubDate>Tue, 04 Mar 2008 05:27:05 +0000</pubDate>
		<dc:creator>Robert Baskin (Online Director)</dc:creator>
		
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://online.yaledailynews.com/2008/03/04/behind-the-scenes-of-the-crime-map/</guid>
		<description><![CDATA[On February 19th, we decided to create the Crime Map. We launched it March 3rd, just 13 days later. This is our story.
We decided to retrieve the crimes data from the Yale Police Department daily crime logs. The next task was how to present the data. We knew we wanted a map, and for a [...]]]></description>
			<content:encoded><![CDATA[<p>On February 19th, we decided to create the Crime Map. We launched it March 3rd, just 13 days later. This is our story.</p>
<p>We decided to retrieve the crime data from the Yale Police Department <a href="http://www.yale.edu/police/crimelog.html">daily crime logs</a>. The next task was how to present the data. We knew we wanted a map, and for a while we toyed with the idea of creating our own Flash map. Google Maps was always an option, but we can&#8217;t customize the look and feel as much, and it doesn&#8217;t show the names of campus buildings. Eventually, we decided that an established mapping solution with an easy API was worth using, so we chose Google Maps. <span id="more-8"></span>The only major problem we had with it was that their <a href="http://code.google.com/apis/maps/documentation/reference.html#GMarkerManager">GMarkerManager</a> doesn&#8217;t have a removeAll() or clear() function (which we needed when we cleared the map to show new crimes), so we turned to the basically second-party <a href="http://gmaps-utility-library.googlecode.com/svn/trunk/markermanager/release/src/markermanager.js">MarkerManager</a> script. It&#8217;s a bit slow with a lot of markers, so we limit the results to 100 crimes so the browser doesn&#8217;t hang for 5-10 seconds.</p>
<p>Then we turned to the table of crimes, which we had to regenerate on every Ajax request. We went with <a href="http://wiki.script.aculo.us/scriptaculous/show/Builder">Scriptaculous&#8217; Builder functionality</a>. As we can say about many things, it worked great in everything except Internet Explorer. First of all, IE demands that all nodes be appended to a &lt;tbody&gt; instead of directly to the table. And it requires that you build the entire &lt;tbody&gt; before appending it to the &lt;table&gt;. But we finally got it working. And after some basic testing, we actually found that it was faster than creating the table by setting innerHTML, but we may revisit that. (We tested this with <a href="http://www.getfirebug.com/">Firebug&#8217;s</a> JavaScript profiler in Firefox 3 Beta 3, so the results are definitely not comprehensive.)</p>
<p>We ran into some fun issues with the tab box structure for the form on the left - look for an upcoming blog post here that will talk a bit more about the tab boxes.</p>
<p>Overall, we&#8217;re very excited about what we came up with. We think the data is relevant and well-presented, and it&#8217;s very cool to take a feature from conception to launch in less than two weeks. We hope that you find it useful.</p>
<p>Robert Baskin<br />
Online Director<br />
Yale Daily News</p>
]]></content:encoded>
			<wfw:commentRss>http://online.yaledailynews.com/2008/03/04/behind-the-scenes-of-the-crime-map/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Announcing the Yale Daily News Crime Map</title>
		<link>http://online.yaledailynews.com/2008/03/03/announcing-the-yale-daily-news-crime-map/</link>
		<comments>http://online.yaledailynews.com/2008/03/03/announcing-the-yale-daily-news-crime-map/#comments</comments>
		<pubDate>Mon, 03 Mar 2008 12:00:39 +0000</pubDate>
		<dc:creator>Andrew Mangino (Editor-in-Chief)</dc:creator>
		
		<category><![CDATA[Content]]></category>

		<guid isPermaLink="false">http://online.yaledailynews.com/2008/03/03/announcing-the-yale-daily-news-crime-map/</guid>
		<description><![CDATA[Dear YaleDailyNews.com subscribers,
It may only have &#8220;Yale&#8221; in its title, but the Oldest College Daily has long included comprehensive coverage of New Haven as well. In the spirit of that dual mission, today we launch an interactive crime map for Yale&#8217;s campus and the surrounding neighborhoods, with crimes — sortable by type, date or location [...]]]></description>
			<content:encoded><![CDATA[<p>Dear YaleDailyNews.com subscribers,</p>
<p>It may only have &#8220;Yale&#8221; in its title, but the Oldest College Daily has long included comprehensive coverage of New Haven as well. In the spirit of that dual mission, today we launch an interactive crime map for Yale&#8217;s campus and the surrounding neighborhoods, with crimes — sortable by type, date or location — dating back more than a year. You may, for example, select your residential college (if applicable) and see every crime committed nearby over the past week, month or year. (You may also select the two proposed residential colleges on Grove Street!)</p>
<p>You can access the map at <a href="http://www.yaledailynews.com/crimes">http://www.yaledailynews.com/crimes</a>. As always, please send your comments and questions to <a href="mailto:editor@yaledailynews.com">editor@yaledailynews.com</a>.</p>
<p>Sincerely,<br />
Andrew Mangino<br />
Editor in Chief<br />
Yale Daily News</p>
]]></content:encoded>
			<wfw:commentRss>http://online.yaledailynews.com/2008/03/03/announcing-the-yale-daily-news-crime-map/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
