A few interesting links

2007/04/30

A repository of presentations on a variety of topics – http://www.slideshare.net/.

A specialized code-only search engine: http://www.krugle.com/

A new Google product for vertical search – Google Co-op.


A great new podcast

2007/04/27

Running out of Security Now! and TWIT episodes, I have subscribed to and started listening to a few newly discovered podcasts.

I have started listening to the .NET-specific one from Scott Hanselman, named Hanselminutes. There are a couple of factors that make it better than other geeky podcasts out there. The first reason is content quality. There is a very high signal-to-noise ratio, pretty much all content counts, and both guys, in addition to being smart, are quite good at striking the right balance between keeping on topic and spontaneity.

The second reason – it is not technology-religious and is quite pragmatic. Scott obviously likes .NET and is passionate about Microsoft technologies – but there is no sucking up; Scott is very open minded – just listen to the Dynamic Languages episode where they talk about Ruby on Rails. He even owns a Mac and tests multiplatform software on multiple platforms 🙂

The third reason: there is a PDF transcript available with lots of good links which would otherwise be lost (unless you listen with a pen in your hand in front of a computer, and not while driving or walking, as I do.)

And last but not least – very good audio quality, professionally recorded and processed. After listening to this episode about professional audio processing, it was clear why. Episodes are reasonably short – 20 to 40 minutes.

I have learned quite a lot from the show about a variety of interesting things – e.g. that WPF/E (recently renamed Silverlight) may actually be something I really want to look at :-). It almost sounded too good to be true.

So if .NET is part of your world – or you want it to become part of your world – go for HanselMinutes.


Accessing SQL Server 2005 from Ruby

2007/04/25

The natural gravity of our environment keeps pulling me out of the scripting and Ruby world back to the .NET zone :-). Out of curiosity, I looked into how easy or hard it is to access a database other than the default MySQL – which dominates all the books, samples and tutorials – from Ruby.

I found (at least) three ways: one likely platform independent and two Windows only. The platform-independent one requires the ADO.rb module and works through the DBI interface. This sounds very similar to the way Perl used to access databases, so I put it at the end of the list (some details are here). The first Windows-only method used ODBC – no, thanks. Configuring machine-specific data sources is not the way I wanted to go.

The best way was described on Dave Mullet's blog and worked perfectly. I took the liberty of making the class more configurable – the result is published at code snippets.
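
For illustration, a minimal sketch of that WIN32OLE/ADO route (Windows only) could look roughly like the code below – the provider string, server, database and credentials are just placeholders, and this simplified class is not the more configurable version published in the snippet:

    # Minimal sketch of the WIN32OLE/ADO route (Windows only). Connection
    # details are placeholders; adjust the provider and credentials as needed.
    require 'win32ole'

    class SqlServerClient
      def initialize(host, database, user, password)
        @conn = WIN32OLE.new('ADODB.Connection')
        @conn.Open("Provider=SQLOLEDB;Data Source=#{host};" \
                   "Initial Catalog=#{database};" \
                   "User Id=#{user};Password=#{password};")
      end

      # Run a SELECT statement and yield each row as a hash of column => value.
      def query(sql)
        rs = WIN32OLE.new('ADODB.Recordset')
        rs.Open(sql, @conn)
        columns = (0...rs.Fields.Count).map { |i| rs.Fields(i).Name }
        until rs.EOF
          yield columns.each_with_object({}) { |c, row| row[c] = rs.Fields(c).Value }
          rs.MoveNext
        end
        rs.Close
      end

      def close
        @conn.Close
      end
    end

    # Usage (placeholder values):
    # db = SqlServerClient.new('localhost', 'Northwind', 'sa', 'secret')
    # db.query('SELECT TOP 5 CompanyName FROM Customers') { |row| puts row['CompanyName'] }
    # db.close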

If you do not know DZone Snippets, give it a look – it is the best way to share chunks of code that are too big to fit into a blog post (unlike one-liners or small functions) but are not really projects that could be uploaded to SourceForge, CodePlex or a similar open source repository. Snippets have a syntax-coloring filter and a tagging system – and the site is YACLRRW2C – yet another cool-looking Ruby on Rails Web 2.0-ish creation …


In praise of the polyglot

2007/04/24

If you’ve ever envied the ability of multilingual friends to bridge language gaps wherever they travel or gained new appreciation for your native language by learning a new one, then you can probably see the advantage of being a programming language polyglot. Developer polyglots travel the IT world more freely than do monolinguists (sure that they can apply their skills in any environment), and they also tend to better appreciate the programming language called home, because among other things they know the roots from which that home is sprung. Isn’t it time you became a polyglot?

From http://www-128.ibm.com/developerworks/library/j-ruby/?ca=dgr-jw22RubyOffRails

I certainly feel like it, which explains the recent Ruby-esque and Rails-ish diversions :-). To compensate for the high-level coding in Ruby, Objective-C as the next language will bring me back down to the metal, I mean Core 🙂


One more reason to like TWIT

2007/04/23

As it happens, both of my favorite podcasters – Leo Laporte as well as Steve Gibson – are passionate e-book readers. They have both purchased and are (mostly) happy users of the Sony PRS-500, aka Sony Reader, in addition to several other platforms (Palm, Pocket PC). In the latest podcasts – both TWIT and Security Now! – eBooks and eReaders got quite some publicity :-).

Both had a similar experience with the PDF format on the PRS-500 as I did – but Steve mentioned re-flowing the PDF document to make it better suited for the small screen – something I have looked into but did not get working quite right. The preferred approach with the Reader is the RTF format – which, again, matches my impressions.

A nice discovery was that some independent sources – other than the Sony store – are offering ebooks in LRF format. As an example, see McCollum's books – hardcore, scientific sci-fi (no dwarves, elves, spells and dragons here :-)). Thanks for the tips, gentlemen – I have added McCollum to my reading list.


Four quadrants of uncertainty

2007/04/21

(a follow-up to a discussion from last week)

In every outsourced software project you have four large areas: your customer, the business vertical you are writing software for, your implementation team and the technology foundation or platform you are using.

Two of these areas are on the “internal” side of your organization (team and technology), and two of them are on the external, customer side (business domain and customer organization). Two are more micro-level in the sense that they have a strong human interaction aspect (customer and team), and two are more macro-level and less personal (business domain and technology). For that reason, you can represent these areas as quadrants along two axes, micro/macro and internal/external: micro+internal = team, micro+external = customer, macro+internal = platform, macro+external = business domain.

The degree of risk in the project is proportional to how many of these quadrants are actually unknown or new. Why is that?

With a new customer, you do not have established communication channels, and the degree of trust based on past successes is not there. You do not know the customer’s corporate culture, internal structure, relations, influencers and inner politics – things that in theory should be completely irrelevant, but in reality may be among the most disruptive discoveries. As one wise man said, “It is usually politics that gets you into trouble, very seldom technology”.

Starting a project with a new, untested team, the team lead does not know what the best division of work between the developers would be. It can be discovered over time, but reshuffling positions has a disruptive effect and usually requires some rework or refactoring. It is also very hard to make estimates, as the team members’ productivity is unknown and depends on the task.

Switching to a new technology, you will obviously have to incorporate some learning period and count on an initial slowdown until everybody is as proficient in “the great new thing” as they were in the “old boring one” :-). This can be planned for and incorporated into the project plan – or ideally, addressed before the project starts on a prototype-slash-pilot project. Something you cannot really plan for is that you and your team may be confronted with an unpleasant surprise late in the development cycle. The design or even the architecture may contain problems related to an incomplete understanding of all aspects of the technology used. A very good example is the not-so-old DCOM and the distributed applications developed with it. In development lab conditions, the applications work great, scale well, deployment is easy and running them is a no-brainer. In real deployment scenarios, where the fast LAN is replaced by VPN tunnels over the internet and networking realities such as variable speed, reliability issues, latency, lots of firewalls etc., you suddenly have a different picture. I still remember the fun of switching from C++ to Java back at the end of the last century – and a few years later switching from Java to C#.

A new business area means that your company and development team enter an application space in which you have not worked before: e.g. health, genetics, nanotechnology. This may create some communication issues between you and the client, who speaks a language you must first learn. The practical impact is that you do not know the existing solutions in the area, or the libraries, frameworks and tools available – and you may very easily find yourself reinventing the wheel.

Known/unknown is not a boolean variable, of course – it is more like a number in the range of (let’s say) 0 to 3, where 0 is perfectly known territory (like writing your seventh order-tracking system in J2EE, again using Struts and JSP). 1 represents some minor unknown areas (for example switching from Struts to Spring, or from ASP.NET 1.1 to 2.0). 2 is a major change in an otherwise known area (like switching from ASP.NET to WinForms or to WPF – but still staying in .NET and C#), and 3 is something completely new – e.g. a Java team trying out Ruby on Rails.

Now evaluate all four quadrants using this scale and add the numbers up to get the risk score. Starting a project with a risk score in the double digits is either a gamble or an act of desperation – and the chances of delivering a high-quality solution that addresses the requirements, on time and within budget, are not great. The range between 7 and 9 is the challenge zone – you have to expect a bumpy ride, but once you reach the target – oh boy, what a feeling. As a side effect, when you succeed, you usually make huge progress in your delivery capabilities, team experience, customer trust level, or all of the above.

A score of 4 to 6 is healthy growth. There is a very high level of confidence in the project outcome and still enough space to deal with the unexpected, get the best out of the team and grow. This is where you want to be most of the time.

A 2-3 score is the comfort zone – a kind of routine, where you are sort of growing and evolving, but quite slowly. It can make sense in certain situations. I saw this happening back in the dot-com boom, when one of our customers kept asking for more and more support work on their large application, so much that it saturated our delivery capability for quite some time. All we could do was hire and train people as quickly as we could to keep up with the demand. But under normal circumstances, if you find yourself in this zone – wake up and make sure you do not slip into the “binary” zone, where all you need to represent the uncertainty factor is a single bit. Why?

Because levels 0 and 1 are “death by stagnation”. You do not want to be here. No change is bad. Always having the same customer, same technology, same business area and a stable, unchanging delivery team means that you are not growing: not getting new customers, not discovering new business areas, not keeping up to date with new technologies and not hiring. Definitely not having fun. With that, your skills portfolio will become obsolete, and unchallenged developers will look for more fun elsewhere.
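
Just to make the arithmetic concrete, a tiny Ruby sketch of the scoring could look like this (the quadrant names and zone boundaries simply mirror the description above – nothing more formal than that):

    # Toy illustration of the risk score: rate each quadrant 0 (perfectly known)
    # to 3 (completely new), add them up and map the total to the zones above.
    QUADRANTS = [:customer, :business_domain, :team, :technology]

    def risk_zone(scores)
      total = QUADRANTS.sum { |q| scores.fetch(q) }
      zone = case total
             when 0..1 then 'death by stagnation'
             when 2..3 then 'comfort zone'
             when 4..6 then 'healthy growth'
             when 7..9 then 'challenge zone'
             else           'gamble / desperate measures'
             end
      [total, zone]
    end

    # Example: new customer (2), known domain (0), mostly known team (1),
    # minor technology change (1).
    total, zone = risk_zone(customer: 2, business_domain: 0, team: 1, technology: 1)
    puts "risk score #{total} => #{zone}"   # prints: risk score 4 => healthy growth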


Unify Tour 2007 and Open Source

2007/04/19

This week was full of demos and workshops. On Tuesday, I spent half a day at the Microsoft Unify Tour 2007. I planned to be there the full day, but an important business meeting made me change the plan. Fortunately, the sessions should be available recorded on the Web, so I will be able to check them later. The topic was bang-on – considering that we are in the final phase of a large project using a distributed network of SQL Server 2005 installations with replication – and the location was the same as usual.

The format chosen for the presentation resembled a reality show: an introductory video and scripted/acted sequences of implementing a Web 2.0-enabled Web site. The two speakers – Christian Beauclair and Damir Bersinic – personified the archetypes of a developer (“.. sure it works. Just give me admin access to the PDC …”) and a system administrator (“.. sure we are aware that patch XYZ has been out since last month. We are evaluating it and will eventually deploy it in a 2-3 month timeframe …”) and walked the audience through a very real-life-like scenario. It turned out to be much better than one would expect – very convincing and often quite funny. I guess neither of them needed to act much – they just became what had been their profession and passion before converting to tech evangelism :-). For me personally, the IT-Pro part was more informative, and SCM 2007 looks like a really nice piece of software worth looking at.

And now for something completely different (as Monty Python used to say): the second presentation I attended this week (today) was on the alchemy of the open source business, organized by OCRI. A couple of names caught my eye – e.g. Mike Milinkovich, executive director of the Eclipse Foundation, and Dwight Deugo, assistant professor at Carleton University and also of Java Report fame. Too bad that the fame ended in 2001 – Java Report was the best Java and OOP oriented journal available. I have very high admiration for everything Eclipse, and both of these gentlemen were involved with the local Ottawa company which designed and created the first version of the platform – so I was curious to hear what they had to say.

From the other talks – it was interesting to see the Nortel chief architect talking about open source. Large companies of Nortel’s size have always had an uneasy relationship with open source software, and how much of the open technologies was allowed in really depended on the team and the local manager. We were lucky and had a great Nortel manager who understood the value and the risks. As a result, we were allowed to use what was right for the task and did some pretty amazing stuff with the Enhydra application server, FreeMarker and Jython back in 1999-2000.

Another surprise for me was Ingres. First of all, that it is still around – and according to the presenter, doing fairly well and growing. Secondly, that Ingres was re-open-sourced – it started originally as an open source project back in the 1980s, then was commercialized and closed, lost the lead against Oracle, IBM and Microsoft, and found a second life in re-opening the codebase. It was probably the only way – but I am not sure whether the world really badly needs another SQL database, with MySQL and PostgreSQL and a few others already established and available, with larger mind-share and momentum …


You should write blogs

2007/04/18

YOU should write blogs.

Even if nobody reads them, you should write them. It’s become pretty clear to me that blogging is a source of both innovation and clarity. I have many of my best ideas and insights while blogging. Struggling to express things that you’re thinking or feeling helps you understand them better.

From http://steve.yegge.googlepages.com/you-should-write-blogs.


Quote of the day

2007/04/15

C++: an octopus made by nailing extra legs to a dog.
(Steve Taylor).

When C++ is your hammer, everything starts to look like your thumb.
(Unknown author).

I guess somebody got pretty frustrated with good ole C++ 🙂


Time to leave CVS ?

2007/04/14

A friend sent me a link two days ago making me aware that Mozilla decided to change the source code control system (SCCS) they use. When a project as large and important as Mozilla moves ahead and changes something as basic and fundamental to development as the SCCS, there must be a good reason behind it. What was even more interesting was the system selected: no, it was not Subversion, but something much less established – Mercurial. Actually, until they selected it, I was barely aware of its existence.

For many years, the synonym for version control was CVS. At least in the open source area and in really large projects, CVS was the SCCS. Then things started to change, and today, if you look around large open source projects, you will see that CVS and Subversion probably still lead, but several really large projects are using very different tools – e.g. the Linux kernel is using Git.

So – is it time to review our technology toolkit for such a basic tool as source code control? The writing is on the wall – one of the few remaining computer magazines in Chapters that I glanced over today (I forgot the name, something Linux related) ran a review of the current free version control systems. The article compared and evaluated RCS, CVS, Subversion, Git, Bazaar and Monotone. In their evaluation, the best marks were assigned to Subversion – it got 9/10 points, mainly because of its ecosystem, support, documentation, user base and add-on/tool support. The runner-up was one of the new kids on the block – a distributed version control system (DVCS) named Bazaar. Good old CVS ended up with 6 points and granddaddy RCS with 3/10.

Among the new tools, two important trends are visible: new approaches to workflow using distributed and decentralized VCS (DVCS) rather than a central-server-based one, and a shift of the implementation platform from traditional very low-level languages (C) and low-level static languages (C++, Java) to dynamic, interpreted scripting languages.

Using dynamic, high-level languages for an SCCS is a natural result of the increasing computing power of hardware – which makes speed and memory limitations disappear – as well as of new, more complex usage scenarios which need more complicated software to address them. Dynamic languages such as Python or Ruby also have excellent support for networking, support most communication protocols right out of the box, and offer inherent portability between all supported platforms. This was often an issue for older systems written in C/C++. Bazaar – for example – is written in Python and can therefore run on almost any platform.

The shift towards decentralized and distributed systems is a logical continuation of the trend that started with abandoning the explicit (reserved) check-out mode of work. Does anybody remember the joys of Visual SourceSafe and the issues when a colleague left for a two-week vacation with a few critical files checked out? In systems using reserved checkout, every programmer had a read-only copy of the code and could edit only files explicitly checked out – and a checkout was limited to at most one person at any given time. The big change with CVS was allowing everybody to have all code writable (in his/her own sandbox) and allowing parallel changes to the same file. Simultaneous changes from different people were merged using a two-step process: update (transfer the changes from the repository to the local sandbox) and commit (upload the local changes to the single, central repository). Possible conflicts were resolved between the update and the commit.

What is the fundamental difference between a centralized and a decentralized VCS? In my limited understanding of how DVCSs work (as I have not yet worked with any of them on a real project): unlike with CVS, where there is one central repository and every developer has a local copy of a single state only (the sandbox), with a DVCS every developer has their own copy of the whole repository and therefore access to all versions of all files without depending on a centralized server. Every node is a repository and every commit is (or can be) local. Rather than synchronizing the local sandbox with the central repository, the independent repositories are synchronized. Nodes can synchronize directly, as long as the changes are made available to other nodes by “publishing” them – usually over HTTP or SSH/FTP. The content of your repository will thus depend on the number of nodes you have synchronized with – and on their own synchronization history.
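
To make that difference concrete, here is a toy model in Ruby – not any real DVCS, just an illustration of “every node is a repository”: each node keeps the complete history, commits happen locally, and synchronization means exchanging whatever changesets the other side is missing:

    # Toy model of the decentralized idea: every node holds the full history,
    # commits are local, and "pulling" copies over the changesets we are missing.
    require 'set'

    class Node
      attr_reader :name, :changesets

      def initialize(name)
        @name = name
        @changesets = Set.new    # the complete local history; no central server
      end

      def commit(description)    # needs no network connectivity at all
        @changesets << "#{name}: #{description}"
      end

      def pull(other)            # take everything the other node has that we do not
        @changesets.merge(other.changesets)
      end
    end

    alice = Node.new('alice')
    bob   = Node.new('bob')
    alice.commit('fix parser bug')   # committed offline
    bob.commit('add logging')
    bob.pull(alice)                  # bob now holds both histories
    puts bob.changesets.to_a.sort    # => alice: fix parser bug, bob: add logging
    puts alice.changesets.size       # alice still has only her own commit: 1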

New models of interaction are just one of the emerging features. A DVCS can easily simulate the workflows and processes of a traditional VCS, but it also allows several workflows impossible with a centralized VCS – like committing changes and getting differences without any connectivity to a remote repository. (To be completely fair, you can do that to a certain extent with Subversion – because the latest state from the repository is cached locally, you can always do a diff and see the local changes made – or undo them, but only against one revision.) A DVCS is also very flexible – it allows very dynamic creation of groups and branches and has no single point of failure.

So what is the answer to the question in the title? Should we throw away traditional VCS and start using the new distributed, decentralized tools? I am not sure. With flexibility and freedom come overhead and responsibility. Synchronizing large repositories can be expensive – both in time and in network bandwidth. What is more important, it may require changes to the project management style and additional effort invested in keeping the codebase under control – see the development methodologies in the Bazaar manual. When no repository is central, it is much harder to say where the latest, most complete version of the system’s code is. And – unless you address it with your process – it may be hard to tell whether it is currently even available. A DVCS seems to make sense mostly when the team is very large and geographically distributed. I do not see many advantages for a smaller development team working in one location – other than fixing several CVS/Subversion annoyances and possibly providing better and easier merging. What is also important is tool availability and IDE integration. All this needs to be better understood. Right now, my answer to whether to switch is: no. But to start looking into DVCS and evaluating the pros and cons – absolutely yes.

A nice comparison of the features of the various SCCSs is available here.