<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Choose lossless VCS tools if you have that luxury</title>
	<atom:link href="http://www.markshuttleworth.com/archives/125/feed" rel="self" type="application/rss+xml" />
	<link>http://www.markshuttleworth.com/archives/125</link>
	<description>Planetary perspectives</description>
	<lastBuildDate>Mon, 06 Feb 2012 03:55:46 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: biopython: looking for a new VCS? &#124; Bioinfo Blog!</title>
		<link>http://www.markshuttleworth.com/archives/125/comment-page-1#comment-316667</link>
		<dc:creator>biopython: looking for a new VCS? &#124; Bioinfo Blog!</dc:creator>
		<pubDate>Mon, 16 Feb 2009 21:53:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.markshuttleworth.com/archives/125#comment-316667</guid>
		<description>[...] Mark Shuttleworth - Choose lossless VCS tools if you have that luxury (Mark Shuttleworth) (about migrating to svn or git, and bazaar) [...]</description>
		<content:encoded><![CDATA[<p>[...] Mark Shuttleworth &#8211; Choose lossless VCS tools if you have that luxury (Mark Shuttleworth) (about migrating to svn or git, and bazaar) [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: beza1e1</title>
		<link>http://www.markshuttleworth.com/archives/125/comment-page-1#comment-146941</link>
		<dc:creator>beza1e1</dc:creator>
		<pubDate>Tue, 14 Aug 2007 08:52:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.markshuttleworth.com/archives/125#comment-146941</guid>
		<description>I want to do some research on this within a study thesis. Maybe you&#039;d like to comment on my evaluation strategy?

http://computerroriginaliascience.blogspot.com/2007/08/how-to-evaluate-dvcs.html</description>
		<content:encoded><![CDATA[<p>I want to do some research on this within a study thesis. Maybe you&#8217;d like to comment on my evaluation strategy?</p>
<p><a href="http://computerroriginaliascience.blogspot.com/2007/08/how-to-evaluate-dvcs.html" rel="nofollow">http://computerroriginaliascience.blogspot.com/2007/08/how-to-evaluate-dvcs.html</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Petty Pace &#187; Blog Archive &#187; Heartbeats and Sails</title>
		<link>http://www.markshuttleworth.com/archives/125/comment-page-1#comment-143006</link>
		<dc:creator>Petty Pace &#187; Blog Archive &#187; Heartbeats and Sails</dc:creator>
		<pubDate>Sat, 04 Aug 2007 03:49:23 +0000</pubDate>
		<guid isPermaLink="false">http://www.markshuttleworth.com/archives/125#comment-143006</guid>
		<description>[...] Mark Shuttleworth: What’s good enough performance? Well, I like to think in terms of “heartbeat time”. If the major operations which I have to do regularly (several times in an hour) take less than a heartbeat, then I don’t ever feel like I’m waiting. Things which happen 3-5 times in a day can take a bit longer, up to a minute, and those fit with regular workbreaks that I would take anyhow to clear my head for the next phase of work, or rest my aching fingers. [...]</description>
		<content:encoded><![CDATA[<p>[...] Mark Shuttleworth: What’s good enough performance? Well, I like to think in terms of “heartbeat time”. If the major operations which I have to do regularly (several times in an hour) take less than a heartbeat, then I don’t ever feel like I’m waiting. Things which happen 3-5 times in a day can take a bit longer, up to a minute, and those fit with regular workbreaks that I would take anyhow to clear my head for the next phase of work, or rest my aching fingers. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Version Control: Design for Integration &#171; Agile Teams, Open Software, Passionate Users</title>
		<link>http://www.markshuttleworth.com/archives/125/comment-page-1#comment-140333</link>
		<dc:creator>Version Control: Design for Integration &#171; Agile Teams, Open Software, Passionate Users</dc:creator>
		<pubDate>Mon, 30 Jul 2007 09:54:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.markshuttleworth.com/archives/125#comment-140333</guid>
		<description>[...] loss when interchanging data with the core product. Mark Shuttleworth captures this point nicely in Choose lossless VCS tools if you have that luxury. Truly caring about integration goes even deeper in my opinion: it means explicitly making it [...]</description>
		<content:encoded><![CDATA[<p>[...] loss when interchanging data with the core product. Mark Shuttleworth captures this point nicely in Choose lossless VCS tools if you have that luxury. Truly caring about integration goes even deeper in my opinion: it means explicitly making it [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: brendan</title>
		<link>http://www.markshuttleworth.com/archives/125/comment-page-1#comment-121587</link>
		<dc:creator>brendan</dc:creator>
		<pubDate>Tue, 26 Jun 2007 18:52:13 +0000</pubDate>
		<guid isPermaLink="false">http://www.markshuttleworth.com/archives/125#comment-121587</guid>
		<description>Actually, git can only get copy 100% right if nothing is changed in the target file in the same commit. Otherwise, you have to give it a similarity threshold (70%? 80%?) for it to decide whether something is a copy. This is, of course, a guess, resulting from lost information. Nothing stops other systems from making the same guess if they have to (for instance, mercurial supports a similarity option for addremove), but it is better to have the information.

git also pays a performance penalty for digging up copy and rename information, which is why you need flags like --find-copies-harder etc.</description>
		<content:encoded><![CDATA[<p>Actually, git can only get copy 100% right if nothing is changed in the target file in the same commit. Otherwise, you have to give it a similarity threshold (70%? 80%?) for it to decide whether something is a copy. This is, of course, a guess, resulting from lost information. Nothing stops other systems from making the same guess if they have to (for instance, mercurial supports a similarity option for addremove), but it is better to have the information.</p>
<p>git also pays a performance penalty for digging up copy and rename information, which is why you need flags like --find-copies-harder etc.</p>
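<p>To illustrate why a threshold makes this a guess, here is a minimal Python sketch. It is not git's or mercurial's actual heuristic: it uses the standard library's difflib as a stand-in similarity measure, and the names similarity and looks_like_copy are made up for this example.</p>

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 line-based similarity score between two file contents."""
    # Stand-in measure; real VCS tools use their own (different) scoring.
    return SequenceMatcher(None, a.splitlines(), b.splitlines()).ratio()


def looks_like_copy(source: str, target: str, threshold: float = 0.8) -> bool:
    """Guess whether target was copied from source, given a similarity threshold."""
    return similarity(source, target) >= threshold


# Ten lines, of which the copy changes exactly one in the same commit.
original = "\n".join(f"line {i}" for i in range(10))
edited_copy = original.replace("line 9", "something else")
unrelated = "alpha\nbeta\ngamma"
```

<p>With these inputs the same edited copy is accepted at a threshold of 0.8 and rejected at 0.95, so the answer depends on a number someone picked rather than on recorded information.</p>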
]]></content:encoded>
	</item>
	<item>
		<title>By: Bruce</title>
		<link>http://www.markshuttleworth.com/archives/125/comment-page-1#comment-121505</link>
		<dc:creator>Bruce</dc:creator>
		<pubDate>Tue, 26 Jun 2007 16:51:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.markshuttleworth.com/archives/125#comment-121505</guid>
		<description>Legal proceedings judged Microsoft a monopoly.

Anyone who does business as a partner with Microsoft should know that first and ensure that they have adequate legal safeguards.

Which of Microsoft&#039;s joint business ventures with other companies did not leave Microsoft as the sole beneficiary of that arrangement?

Microsoft may want to tie others to a standard and thus be free to innovate and to create new ad hoc standards using their monopoly power.

Today, IT workers spend more of their effort using Microsoft&#039;s software than creating new applications themselves.  The overhead of using such tools saps the workforce.  We were far more productive using less innovation some years ago.</description>
		<content:encoded><![CDATA[<p>Legal proceedings judged Microsoft a monopoly.</p>
<p>Anyone who does business as a partner with Microsoft should know that first and ensure that they have adequate legal safeguards.</p>
<p>Which of Microsoft&#8217;s joint business ventures with other companies did not leave Microsoft as the sole beneficiary of that arrangement?</p>
<p>Microsoft may want to tie others to a standard and thus be free to innovate and to create new ad hoc standards using their monopoly power.</p>
<p>Today, IT workers spend more of their effort using Microsoft&#8217;s software than creating new applications themselves.  The overhead of using such tools saps the workforce.  We were far more productive using less innovation some years ago.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ¡darandandunguen! &#187; Blog Archive &#187; ¿Red Hat, Ubuntu, Mandriva &#38; Microsoft? ¡No!</title>
		<link>http://www.markshuttleworth.com/archives/125/comment-page-1#comment-116820</link>
		<dc:creator>¡darandandunguen! &#187; Blog Archive &#187; ¿Red Hat, Ubuntu, Mandriva &#38; Microsoft? ¡No!</dc:creator>
		<pubDate>Wed, 20 Jun 2007 12:38:54 +0000</pubDate>
		<guid isPermaLink="false">http://www.markshuttleworth.com/archives/125#comment-116820</guid>
		<description>[...] of refusals of new agreements with Microsoft: Red Hat, Ubuntu and Mandriva. [...]</description>
		<content:encoded><![CDATA[<p>[...] of refusals of new agreements with Microsoft: Red Hat, Ubuntu and Mandriva. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sam Vilain</title>
		<link>http://www.markshuttleworth.com/archives/125/comment-page-1#comment-116603</link>
		<dc:creator>Sam Vilain</dc:creator>
		<pubDate>Wed, 20 Jun 2007 05:01:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.markshuttleworth.com/archives/125#comment-116603</guid>
		<description>Another point - only those systems which use revision numbers which hash the content and history to that point (i.e., git, hg and monotone) are actually checking the integrity of the content by design as they go.  It is therefore possible (unless you use Testaments for every revision) that your bzr repository could have historical corruption (more likely tampering) without you noticing.

So, historical tampering would result in what comes out of bzr not being the same as what went in.  And people who copied the tampered repository would never know.

Which one is &quot;lossless&quot;, again?</description>
		<content:encoded><![CDATA[<p>Another point &#8211; only those systems which use revision numbers which hash the content and history to that point (i.e., git, hg and monotone) are actually checking the integrity of the content by design as they go.  It is therefore possible (unless you use Testaments for every revision) that your bzr repository could have historical corruption (more likely tampering) without you noticing.</p>
<p>So, historical tampering would result in what comes out of bzr not being the same as what went in.  And people who copied the tampered repository would never know.</p>
<p>Which one is &#8220;lossless&#8221;, again?</p>
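<p>The hash-chaining idea above can be sketched in a few lines of Python. This is a toy model only, not the actual object format of any of these tools; revision_id, build_history and verify are hypothetical names chosen for the example.</p>

```python
import hashlib


def revision_id(parent_id: str, content: bytes) -> str:
    """Derive a revision id from the parent's id plus this revision's content."""
    return hashlib.sha1(parent_id.encode() + b"\0" + content).hexdigest()


def build_history(contents: list[bytes]) -> list[str]:
    """Compute the chain of revision ids for a linear history."""
    ids = []
    parent = ""  # the first revision has no parent
    for content in contents:
        parent = revision_id(parent, content)
        ids.append(parent)
    return ids


def verify(contents: list[bytes], ids: list[str]) -> bool:
    """Recompute the chain and check it against the recorded ids."""
    return build_history(contents) == ids


history = [b"v1", b"v2", b"v3"]
ids = build_history(history)
tampered = [b"v1", b"EVIL", b"v3"]
```

<p>Because each id folds in its parent's id, changing the middle revision also changes the id of every later one, so anyone holding the original head id would notice.</p>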
]]></content:encoded>
	</item>
	<item>
		<title>By: Andy Parkins</title>
		<link>http://www.markshuttleworth.com/archives/125/comment-page-1#comment-116185</link>
		<dc:creator>Andy Parkins</dc:creator>
		<pubDate>Tue, 19 Jun 2007 15:08:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.markshuttleworth.com/archives/125#comment-116185</guid>
		<description>I think you&#039;ve seriously misunderstood git.  It doesn&#039;t track renames because that&#039;s the wrong thing to do.  It can easily reconstruct that information later though.  This is not the &quot;interpolation&quot; you mention in your post; it is not guessing, git would know.  How?  Let me explain:

What git actually stores is the content of the files; it stores them in a file named as the hash of the contents - let&#039;s pretend that hash is 12345ABCDEF.  Git then keeps a list of hashes against filenames; so:

12345ABCDEF somefile.c

Now, let&#039;s say you rename that file, from one commit to the next; git will store the new list as

12345ABCDEF newname.c

Notice that the hash hasn&#039;t changed.  So, when you compare these two lists, it&#039;s really easy to see that somefile.c was renamed to newname.c, because the hash is common to both.  Similarly for copies:

12345ABCDEF somefile.c
12345ABCDEF copy.c

Comparing this list with the original one, it&#039;s easy to see that  somefile.c hasn&#039;t changed, but copy.c has been introduced and is a copy of somefile.c.

See?  Git didn&#039;t need to record the rename explicitly - it&#039;s inherently available in what it does store.

What&#039;s even better is that you get the copy free, because it&#039;s got the same content, this list just references the same content twice, and on checkout git reads the same source object twice.

As it happens, git is even cleverer than I&#039;ve described above and can make educated guesses about copies and renames that were changed during the revisions.  

What&#039;s really great about this, is that git figures it out on its own.  So you don&#039;t need special commands for copy, move, mkdir, rm, etc.  Git knows what you&#039;ve done because you did it, not because you told it you did it.

To be fair, there are currently a few UI issues with git erring on the side of speed by default, and not doing these detections as it parses history.  However, they aren&#039;t particularly expensive operations and so if you wish (as I do), you can make git always detect these copies and renames.  However that is a user interface issue, and should not be used to say that git doesn&#039;t track renames.  Who cares that it doesn&#039;t track them - it can show them to you, the user, which is all that matters.</description>
		<content:encoded><![CDATA[<p>I think you&#8217;ve seriously misunderstood git.  It doesn&#8217;t track renames because that&#8217;s the wrong thing to do.  It can easily reconstruct that information later though.  This is not the &#8220;interpolation&#8221; you mention in your post; it is not guessing, git would know.  How?  Let me explain:</p>
<p>What git actually stores is the content of the files; it stores them in a file named as the hash of the contents &#8211; let&#8217;s pretend that hash is 12345ABCDEF.  Git then keeps a list of hashes against filenames; so:</p>
<p>12345ABCDEF somefile.c</p>
<p>Now, let&#8217;s say you rename that file, from one commit to the next; git will store the new list as</p>
<p>12345ABCDEF newname.c</p>
<p>Notice that the hash hasn&#8217;t changed.  So, when you compare these two lists, it&#8217;s really easy to see that somefile.c was renamed to newname.c, because the hash is common to both.  Similarly for copies:</p>
<p>12345ABCDEF somefile.c<br />
12345ABCDEF copy.c</p>
<p>Comparing this list with the original one, it&#8217;s easy to see that  somefile.c hasn&#8217;t changed, but copy.c has been introduced and is a copy of somefile.c.</p>
<p>See?  Git didn&#8217;t need to record the rename explicitly &#8211; it&#8217;s inherently available in what it does store.</p>
<p>What&#8217;s even better is that you get the copy free, because it&#8217;s got the same content, this list just references the same content twice, and on checkout git reads the same source object twice.</p>
<p>As it happens, git is even cleverer than I&#8217;ve described above and can make educated guesses about copies and renames that were changed during the revisions.  </p>
<p>What&#8217;s really great about this, is that git figures it out on its own.  So you don&#8217;t need special commands for copy, move, mkdir, rm, etc.  Git knows what you&#8217;ve done because you did it, not because you told it you did it.</p>
<p>To be fair, there are currently a few UI issues with git erring on the side of speed by default, and not doing these detections as it parses history.  However, they aren&#8217;t particularly expensive operations and so if you wish (as I do), you can make git always detect these copies and renames.  However that is a user interface issue, and should not be used to say that git doesn&#8217;t track renames.  Who cares that it doesn&#8217;t track them &#8211; it can show them to you, the user, which is all that matters.</p>
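<p>The exact-match comparison described above can be sketched in Python. This is a simplified illustration, not git's code: snapshot and detect_exact are hypothetical names, and real git hashes a blob together with a small header rather than the raw bytes alone.</p>

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Hash file content; identical content always yields the same id."""
    return hashlib.sha1(content).hexdigest()


def snapshot(files: dict[str, bytes]) -> dict[str, str]:
    """Map each filename to the hash of its content (a toy list of hashes)."""
    return {name: blob_id(data) for name, data in files.items()}


def detect_exact(old: dict[str, str], new: dict[str, str]) -> list[tuple[str, str, str]]:
    """Report exact renames and copies by matching hashes between two snapshots."""
    names_by_hash: dict[str, list[str]] = {}
    for name, digest in old.items():
        names_by_hash.setdefault(digest, []).append(name)

    events = []
    for name, digest in sorted(new.items()):
        if name in old:
            continue  # the path already existed, so it is not a new name
        sources = sorted(names_by_hash.get(digest, []))
        if not sources:
            continue  # content never seen before: a genuinely new file
        source = sources[0]
        # If the source path is still present, the new path is a copy.
        kind = "copy" if source in new else "rename"
        events.append((kind, source, name))
    return events


code = b"int main(void) { return 0; }\n"
before = snapshot({"somefile.c": code})
renamed = snapshot({"newname.c": code})
copied = snapshot({"somefile.c": code, "copy.c": code})
```

<p>Comparing before with renamed reports a rename of somefile.c to newname.c, and comparing before with copied reports copy.c as a copy. Note that this only covers unchanged content; once the file is also edited, the hashes differ and some similarity heuristic is needed.</p>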
]]></content:encoded>
	</item>
	<item>
		<title>By: Sam Vilain</title>
		<link>http://www.markshuttleworth.com/archives/125/comment-page-1#comment-115743</link>
		<dc:creator>Sam Vilain</dc:creator>
		<pubDate>Tue, 19 Jun 2007 01:26:07 +0000</pubDate>
		<guid isPermaLink="false">http://www.markshuttleworth.com/archives/125#comment-115743</guid>
		<description>&quot;Lossless&quot; is a relative term.  If you define what the programmer puts in as 100% quality, then sure you&#039;re right.  However once you realise that they are human, and will make mistakes, then the whole &quot;lossless&quot; argument is undermined.

Then you realise you&#039;re back with a warehouse of revisions, and looking back at history is not following the breadcrumbs left by the original developers as gospel but instead *data mining*.</description>
		<content:encoded><![CDATA[<p>&#8220;Lossless&#8221; is a relative term.  If you define what the programmer puts in as 100% quality, then sure you&#8217;re right.  However once you realise that they are human, and will make mistakes, then the whole &#8220;lossless&#8221; argument is undermined.</p>
<p>Then you realise you&#8217;re back with a warehouse of revisions, and looking back at history is not following the breadcrumbs left by the original developers as gospel but instead *data mining*.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

