<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Random Thoughts &#187; random</title>
	<atom:link href="http://alexlurthu.wordpress.com/category/random/feed/" rel="self" type="application/rss+xml" />
	<link>http://alexlurthu.wordpress.com</link>
	<description>Straight from the heart!</description>
	<lastBuildDate>Fri, 04 Dec 2009 20:54:08 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<cloud domain='alexlurthu.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://www.gravatar.com/blavatar/2e18259658cdc132622f409f5fd659ca?s=96&#038;d=http://s.wordpress.com/i/buttonw-com.png</url>
		<title>Random Thoughts &#187; random</title>
		<link>http://alexlurthu.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://alexlurthu.wordpress.com/osd.xml" title="Random Thoughts" />
		<item>
		<title>MogileFS &#8211; Platform to Store / Retrieve Files</title>
		<link>http://alexlurthu.wordpress.com/2007/07/30/mogilefs-platform-to-store-retrieve-files/</link>
		<comments>http://alexlurthu.wordpress.com/2007/07/30/mogilefs-platform-to-store-retrieve-files/#comments</comments>
		<pubDate>Mon, 30 Jul 2007 07:50:39 +0000</pubDate>
		<dc:creator>Alex</dc:creator>
				<category><![CDATA[random]]></category>
		<category><![CDATA[file store]]></category>
		<category><![CDATA[framework]]></category>

		<guid isPermaLink="false">http://alexlurthu.wordpress.com/2007/07/30/mogilefs-platform-to-store-retrieve-files/</guid>
		<description><![CDATA[
Storing files, images, or other binary data in a database often causes performance problems. The more widely recommended approach is to write the files to disk and store the metadata about each file, including its location, in a database. This has its pros and cons. If you have such a need and don't [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p align="justify">
<p align="justify">Storing files, images, or other binary data in a database often causes performance problems. The more widely recommended approach is to write the files to disk and store the metadata about each file, including its location, in a database. This has its pros and cons. If you have such a need and don't have the time to build a custom solution, you can try MogileFS, Compete filesystem, or Bit Mountain. If you come from a Perl background, you can use MogileFS. For Python, you can try CFS/BM.</p>
<p align="justify">The <a href="http://mogilefs.schtuff.com/" target="_blank">MogileFS</a> usage flow is as follows:</p>
<p align="justify">
<ul>
<li> app requests to open a file (does RPC via library to a tracker, finding whichever one is up).  does a &#8220;create_open&#8221; request.</li>
<li> tracker makes some load balancing decisions about where it could go, and gives app a few possible locations</li>
<li> app writes to one of the locations (if it fails writing to one midway, it can retry and write elsewhere).</li>
<li> app (client) tells tracker where it wrote to in the &#8220;create_close&#8221; API.</li>
<li> tracker then links that name into the domain&#8217;s namespace (via the database)</li>
<li> tracker, in the background, starts replicating that file around until it&#8217;s in compliance with that file class&#8217;s replication policy</li>
<li> later, app issues a &#8220;get_paths&#8221; request for that domain+key (key == &#8220;filename&#8221;), and tracker replies (after consulting database/memcache/etc) with all the URLs that the file is available at, weighted based on I/O utilization at each location.</li>
<li> app then tries the URLs in order. (although the tracker&#8217;s continually monitoring all hosts/devices, so won&#8217;t return dead stuff, and by default will double-check the existence of the 1st item in the returned list, unless you ask it not to&#8230;)</li>
</ul>
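<p align="justify">The flow above can be sketched as a toy, in-memory model in Python. This is only an illustration under stated assumptions: <code>ToyTracker</code>, <code>store</code>, the <code>pending</code> table and the storage URLs are hypothetical names, a plain dict stands in for the storage nodes, and none of this is the real MogileFS client API.</p>

```python
import random


class ToyTracker:
    """In-memory stand-in for a tracker; NOT the real MogileFS API."""

    def __init__(self, devices, replicas=2):
        self.devices = list(devices)   # hypothetical storage base URLs
        self.replicas = replicas       # copies wanted per file (assumed policy)
        self.pending = {}              # fid -> (domain, key): opened, not yet closed
        self.namespace = {}            # (domain, key) -> list of URLs
        self.next_fid = 1

    def create_open(self, domain, key):
        """Reserve a file id and offer a few candidate locations."""
        fid = self.next_fid
        self.next_fid += 1
        self.pending[fid] = (domain, key)
        candidates = random.sample(self.devices, k=min(2, len(self.devices)))
        return fid, [f"{dev}/{fid}.fid" for dev in candidates]

    def create_close(self, fid, written_url):
        """Link the key into the namespace once the app reports its write."""
        domain, key = self.pending.pop(fid)
        self.namespace[(domain, key)] = [written_url]

    def replicate(self, storage):
        """Background step: copy each file until it has enough replicas."""
        for urls in self.namespace.values():
            file_name = urls[0].rsplit("/", 1)[1]
            for dev in self.devices:
                if len(urls) >= self.replicas:
                    break
                target = f"{dev}/{file_name}"
                if target not in urls:
                    storage[target] = storage[urls[0]]
                    urls.append(target)

    def get_paths(self, domain, key):
        """Return every URL the file is currently available at."""
        return list(self.namespace.get((domain, key), []))


def store(tracker, storage, domain, key, data):
    """Client side: open, write to the first location that works, close."""
    fid, candidates = tracker.create_open(domain, key)
    for url in candidates:
        try:
            storage[url] = data        # stands in for an HTTP PUT to a storage node
        except OSError:
            continue                   # a real write could fail; try the next one
        tracker.create_close(fid, url)
        return url
    raise RuntimeError("no candidate location accepted the write")


tracker = ToyTracker([
    "http://store1.example/dev1",
    "http://store2.example/dev2",
    "http://store3.example/dev3",
])
storage = {}
store(tracker, storage, "photos", "cat.jpg", b"image bytes")
tracker.replicate(storage)
print(tracker.get_paths("photos", "cat.jpg"))
```

<p align="justify">In this sketch, a file that was opened but never closed stays in <code>pending</code> and never appears in the namespace, which is one way to picture the gap between writing the file and recording its metadata.</p>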
<p align="justify">One quirk in the above flow is that the app tells the tracker where it stored the file only after the file has been written to disk. Because storing the file and recording its metadata are not atomic, the two can fall out of sync, for example if the app fails between the write and the &#8220;create_close&#8221; call. I will explore further to see whether MogileFS has an internal mechanism to handle this.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://alexlurthu.wordpress.com/2007/07/30/mogilefs-platform-to-store-retrieve-files/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9200ed713840b8e3b58d6c565a85e946?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Alex</media:title>
		</media:content>
	</item>
		<item>
		<title>Some interesting commonly used words in meetings</title>
		<link>http://alexlurthu.wordpress.com/2007/07/22/some-interesting-commonly-used-words-in-meetings/</link>
		<comments>http://alexlurthu.wordpress.com/2007/07/22/some-interesting-commonly-used-words-in-meetings/#comments</comments>
		<pubDate>Sun, 22 Jul 2007 08:25:39 +0000</pubDate>
		<dc:creator>Alex</dc:creator>
				<category><![CDATA[personal]]></category>
		<category><![CDATA[random]]></category>

		<guid isPermaLink="false">http://alexlurthu.wordpress.com/2007/07/22/some-interesting-commonly-used-words-in-meetings/</guid>
		<description><![CDATA[Synergy, strategic fit, core competencies, best practice, bottom line, revisit, expeditious, to tell you the truth (or the truth is), 24/7, out of the loop, benchmark, value-added, proactive, win-win, think outside the box, fast track, result-driven, knowledge base, at the end of the day, touch base, mindset, client focus(ed), paradigm, game plan, leverage.
   [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p align="justify">Synergy, strategic fit, core competencies, best practice, bottom line, revisit, expeditious, to tell you the truth (or the truth is), 24/7, out of the loop, benchmark, value-added, proactive, win-win, think outside the box, fast track, result-driven, knowledge base, at the end of the day, touch base, mindset, client focus(ed), paradigm, game plan, leverage.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://alexlurthu.wordpress.com/2007/07/22/some-interesting-commonly-used-words-in-meetings/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9200ed713840b8e3b58d6c565a85e946?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Alex</media:title>
		</media:content>
	</item>
		<item>
		<title>Learning search technology</title>
		<link>http://alexlurthu.wordpress.com/2006/06/19/learning-search-technology/</link>
		<comments>http://alexlurthu.wordpress.com/2006/06/19/learning-search-technology/#comments</comments>
		<pubDate>Mon, 19 Jun 2006 07:03:00 +0000</pubDate>
		<dc:creator>Alex</dc:creator>
				<category><![CDATA[random]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://alexlurthu.wordpress.com/2006/06/19/learning-search-technology/</guid>
		<description><![CDATA[After a very long time, I am creating an entry. I have started attending webinars on search technologies and am planning to blog what I learn. These posts will probably be notes on the webinars.
For a high-level overview of search architecture, read The Anatomy of a Large-Scale Hypertextual Web Search Engine by [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>After a very long time, I am creating an entry. I have started attending webinars on search technologies and am planning to blog what I learn. These posts will probably be notes on the webinars.</p>
<p>For a high-level overview of search architecture, read &#8220;The Anatomy of a Large-Scale Hypertextual Web Search Engine&#8221; by Brin and Page, which they published during their time at Stanford.</p>
<p>Search involves 3 major processes.</p>
<p>1. The Spider / Crawler that crawls all the web pages.</p>
<p>Crawlers we write should follow established conventions: identify themselves and where they come from, obey the robots.txt file (which states which parts of a site may be crawled and by which crawlers), and avoid bringing sites down by crawling too aggressively.</p>
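<p>As a rough illustration of these rules, the sketch below uses Python&#8217;s standard <code>urllib.robotparser</code> module to check URLs against a robots.txt file before fetching them. The crawler name, the contact URL, the sample robots.txt content and the example.com URLs are all made up for this example, and nothing is actually fetched.</p>

```python
from urllib.request import Request
from urllib.robotparser import RobotFileParser

# Hypothetical crawler identity; the contact URL tells site owners who we are.
USER_AGENT = "ExampleCrawler/0.1 (+http://example.com/crawler-info)"

# Sample robots.txt content, inlined so the sketch runs without network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2

User-agent: BadBot
Disallow: /
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, user_agent, urls):
    """Keep only the URLs this crawler is permitted to fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def build_request(url):
    """Prepare (but do not send) a request that identifies the crawler."""
    return Request(url, headers={"User-Agent": USER_AGENT})


parser = build_parser(ROBOTS_TXT)
candidates = [
    "http://example.com/index.html",
    "http://example.com/private/report.html",
]
for url in allowed_urls(parser, USER_AGENT, candidates):
    request = build_request(url)
    print("would fetch", request.full_url)

# Seconds to wait between requests, if the site asks for a delay.
print("crawl delay:", parser.crawl_delay(USER_AGENT))
```

<p>A real crawler would also sleep for the requested delay between requests to the same host, which is the simplest way to avoid overloading a site.</p>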
<p>2. The Indexing process</p>
<p>Inverted indexes are used to store the crawled data. The indexing process generates a list of keywords from the content of each page, along with data such as the proximity of the keywords and the location of the words, and stores them in the index under a uniquely generated document ID.</p>
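<p>A minimal sketch of such an index in Python, assuming a very simple tokenizer and sequential document IDs. The function names and sample pages are invented for illustration; word positions are stored so that proximity could be computed later.</p>

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build keyword -> {doc_id: [word positions]} plus a doc_id -> URL table."""
    index = defaultdict(dict)
    doc_urls = {}
    for doc_id, (url, text) in enumerate(pages.items(), start=1):
        doc_urls[doc_id] = url
        for position, word in enumerate(tokenize(text)):
            index[word].setdefault(doc_id, []).append(position)
    return index, doc_urls


# Made-up crawled pages: URL -> page text.
pages = {
    "http://example.com/a": "Search engines crawl the web",
    "http://example.com/b": "The web is large and the crawl is slow",
}
index, doc_urls = build_index(pages)
print(index["crawl"])   # documents containing "crawl", with word positions
print(index["the"])
```

<p>Because each posting keeps the positions of the word, the distance between two keywords in the same document can be derived from the index without re-reading the page.</p>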
<p>3. The Lookup or Retrieval Process with Ranking.</p>
<p>The search servers provide the UI for entering search terms. Once the terms are submitted, the search servers look up the inverted index to find the documents that have the search terms as keywords. They then rank those documents based on the frequency of appearance of the search terms, the location of the terms, the anchor text pointing to the document, the importance of the sites that point to that document, and so on. The documents are displayed in the order given by this ranking. One famous algorithm is Google&#8217;s PageRank algorithm.</p>
<p>Some interesting links to learn about search engines:</p>
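<p>The lookup step can be sketched as follows, assuming a tiny hand-written index that maps each term to per-document frequencies. This toy ranks by raw term frequency only; the other signals mentioned above (term location, anchor text, link importance as in PageRank) are deliberately left out, and the index contents are invented.</p>

```python
from collections import Counter

# Hypothetical inverted index: term -> {doc_id: number of occurrences}.
INDEX = {
    "search": {1: 3, 2: 1},
    "engine": {1: 2, 3: 4},
    "crawler": {2: 5},
}


def lookup(index, query):
    """Return (doc_id, score) pairs, highest total term frequency first."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, frequency in index.get(term, {}).items():
            scores[doc_id] += frequency
    # Sort by descending score; break ties by ascending document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


print(lookup(INDEX, "search engine"))
```

<p>For the query &#8220;search engine&#8221;, document 1 scores 3 + 2 = 5, document 3 scores 4 and document 2 scores 1, so they are returned in that order.</p>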
<p>www.searchenginewatch.com<br />
www.searchenginejournal.com<br />
www.searchengineshowdown.com</p>
<p>http://battellemedia.com</p>
<p>www.searchtools.com</p>
<p>Happy Learning !!</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://alexlurthu.wordpress.com/2006/06/19/learning-search-technology/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9200ed713840b8e3b58d6c565a85e946?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Alex</media:title>
		</media:content>
	</item>
		<item>
		<title>ISO&#8217;s and BIN/CUE</title>
		<link>http://alexlurthu.wordpress.com/2006/03/28/isos-and-bincue/</link>
		<comments>http://alexlurthu.wordpress.com/2006/03/28/isos-and-bincue/#comments</comments>
		<pubDate>Tue, 28 Mar 2006 15:34:00 +0000</pubDate>
		<dc:creator>Alex</dc:creator>
				<category><![CDATA[random]]></category>
		<category><![CDATA[bin]]></category>
		<category><![CDATA[cue]]></category>
		<category><![CDATA[iso]]></category>

		<guid isPermaLink="false">http://alexlurthu.wordpress.com/2006/03/28/isos-and-bincue/</guid>
		<description><![CDATA[Utility to convert BIN/CUE CD image to ISO
bchunk xyz.bin xyz.cue xyz

Command to mount ISO images in Linux
mount -o loop -t iso9660 file.iso /mnt/test
]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><span style="font-weight:bold;">Utility to convert BIN/CUE CD image to ISO</span></p>
<p>bchunk xyz.bin xyz.cue xyz<br />
<span><br />
<span style="font-weight:bold;">Command to mount ISO images in Linux</span></span></p>
<p><span>mount -o loop -t iso9660 file.iso /mnt/test</span></p>
</div>]]></content:encoded>
			<wfw:commentRss>http://alexlurthu.wordpress.com/2006/03/28/isos-and-bincue/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9200ed713840b8e3b58d6c565a85e946?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96" medium="image">
			<media:title type="html">Alex</media:title>
		</media:content>
	</item>
	</channel>
</rss>