.NET on Linux faster than on Windows ? Hmm

2007/06/27

An interesting article on JavaLobby caught my eye today: Do .NET Applications Run Better on Java?

Normally, knowing the not exactly impartial focus of Java-centric sites such as JavaLobby or theserverside.com, one should be careful when reading how much Java outperforms .NET. The bias works the other way too – just look at theserverside.net or other .NET-centric sites to see how superior C# is :-). With that in mind, I looked at the technical report.

The report was produced by Mainsoft, the company behind the cross-compiler product MainSoft for Java EE. The cross-compilation means that the C# or VB.NET code is first compiled into CLR bytecode using standard Microsoft tools and then transformed into Java bytecode, using the CLR bytecode as input. The study was based on a fairly large project – 260,000 lines of code. The published results show that the translated code running on a Java VM and the WebSphere platform outperformed the .NET stack on both Windows and Linux.

So far so good. I have no plan to question the results of the test. One could argue that because the evaluation was not done by an independent third party but by the authors of the cross-compiler, the result had to come out this way – simply because if the measurement had shown that .NET performs better, no report would have been published :-)

First of all, “faster” does not really mean much faster. The speed increase measured is 8% in throughput and very much the same for requests per second. An 8% increase is much too small to justify doing anything major with the application, certainly not re-platforming …

Second, a comparison based on a single port of an application proves absolutely nothing about the results of repeating the same process for another application, let alone every application. It can be an indication of reality just as easily as an exception. I am pretty sure that, given the chance to respond, Microsoft or some other party interested in the opposite outcome could find a C# application that would perform worse after conversion.

A more interesting question is why you would want to do this – replace Win2003 + .NET CLR with some other operating system (Windows/Linux/something else) plus Java plus WebSphere. Clearly, performance cannot be the reason – at least not based on these results.

Price is not a good reason either. From a cost-saving perspective, cross-compiling a .NET application to run in Java under Windows makes no sense, because the .NET runtime is part of the Win2003 license and the cost of that license is there in both cases. This leaves using Linux as the platform (or some other free alternative). True, Linux is free – but support is not, and neither is labor. In real life, the initial cost of licenses is small compared to the accumulated cost of supporting an application in production – and professional support for Windows or Linux is comparably priced. Besides, I bet that the savings gained from not paying for a Windows license will not cover the cost of a WebSphere license plus the Mainsoft for Java EE Enterprise Edition license. True, you could use a free Java EE server such as Tomcat, Glassfish or JBoss with the free Grasshopper version of the cross-compiler – but you may not get the same performance numbers. I like Tomcat and use it all the time: it is nice, flexible, easy to configure – but not the fastest servlet container out there (there must be a reason why Mainsoft picked WebSphere with their own enterprise version after all) …

What is the conclusion? The article above and the approach it describes can be a life saver if you have lots of .NET code and *must*, for some real reason, switch the platform. The reason may be technical – or not – just consider the magic Google did with Linux. The report does hint at one possible good reason – moving your application from PCs to a really big machine – a multiprocessor (multi meaning more than 16 these days, when desktop machines are starting to get quad-cores ;-)) running Unix, or a mainframe. The report shows that an AIX system based on Power5+ with 4 CPUs did ~3400 requests per second whereas the PC-based one did 2335. This would be interesting if the comparison were fair – but it was not. The AIX box had 32 GB RAM whereas the PC (with Linux or Windows) had 2 GB, and you can imagine the price difference in the hardware.

But if there is no really compelling business reason for switching platforms, sticking with Windows when you want to run a .NET application may save you a lot of work – and most likely some money as well.


MSDN Documentation – the worst in class ?

2007/06/14

Did it ever happen to you that you were using some tool day after day – and never realized its pretty big deficiencies? Until somebody coming from a different background pointed out everything that is wrong with the tool? Before that moment of revelation, the issues were just an inconvenience, but right after it they became a real annoyance?

Exactly this happened to me last week and the credit for pointing out what is wrong with MSDN documentation (and the “standard” .NET documentation format in general) goes to Joel :-)

For a developer using an object-oriented language such as C#, Java or Ruby, what you need on a daily basis is to find information about a class: see its public interface, members, constructors and method signatures. Ideally on a single page, with the possibility of drilling down to the details of a method and to a code example. You also very often need to see all implemented interfaces, have easy access to the parent class and (in the case of e.g. an interface inside a framework) to the implementing or derived classes within this context.

Unfortunately, the Microsoft .NET documentation makes this simple task not exactly easy, pleasant or fast. As an example, let's take something really simple, e.g. the DateTime struct. In the documentation, information about this simple class is split across six pages: the DateTime structure itself, Members, Fields, Constructor, Methods and Properties. If you expect that, with this devotion to low-level categorization, the dedicated page for e.g. Methods will give you all the details about all DateTime methods, you are wrong. What the Methods page gives you is just a list of names, not even the method signatures – parameter types and return values are missing. To get this information, you must click through to the page dedicated to that method. If the method is overloaded (take e.g. the omnipresent ToString), the Methods page contains only one name and only the next page gives you the signatures, linked to yet another page with details. See for yourself:

[Screenshot: picture-2.png]

In addition to the bad information structure, almost every link causes a full page reload.

Compare with how much more usable the Java documentation is: it is very easy to see all interfaces, methods, constants, parent classes and implemented interfaces on a single page. The dated frames-based UI actually makes a lot of sense and is (short of an AJAX-based dynamic site) a much better way to navigate the documentation.

With all that said, I am not surprised that tools such as Reflector are so extremely popular in the .NET world. Reflector not only provides a very useful debugging/inspection tool, but thanks to its excellent and compact presentation of information about a class retrieved via reflection, it is the fastest way to get meaningful information on the core classes' API. Other than Reflector, the other fast way to get information on .NET core library details is a Google search.
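For comparison, here is a minimal sketch of the kind of reflection-based listing Reflector builds on – dumping all public method signatures of DateTime onto one screen using nothing but plain .NET 2.0 reflection (nothing Reflector-specific, just to show how little it takes):

```csharp
using System;
using System.Reflection;

class ApiDump
{
    static void Main()
    {
        // List every public method of DateTime with its full signature,
        // overloads included - the view MSDN spreads over several pages.
        foreach (MethodInfo m in typeof(DateTime).GetMethods(
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.Static))
        {
            ParameterInfo[] ps = m.GetParameters();
            string[] args = new string[ps.Length];
            for (int i = 0; i < ps.Length; i++)
                args[i] = ps[i].ParameterType.Name + " " + ps[i].Name;

            Console.WriteLine("{0} {1}({2})",
                m.ReturnType.Name, m.Name, string.Join(", ", args));
        }
    }
}
```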

Try for example how fast you can access documentation for a particular class starting with a Google search – let's take e.g. WebConfigurationManager. Google search returns blazingly fast (as always) – with the MSDN page as the first hit. Now compare how fast you get the same information starting from the MSDN home page (which is, btw, advertising ‘new and improved search and navigation’). Your mileage may vary, but I usually see a 3-8 second delay in the search response (compared to <0.5 sec for Google). A few seconds seems like no problem, but when you do it all the time, it easily becomes pretty annoying. Even more so when you realize that Google is searching the WHOLE WEB, with content it does not own or control, only indexes and ranks, whereas MSDN search is searching the MSDN data repository, which is – however you measure it – many orders of magnitude smaller, and Microsoft fully controls most of its content.

Why can't the largest and most powerful software company create documentation that is useful and usable? Even the documentation for the open-source Mono project (a port of .NET to Linux and other platforms) is *much* better than the original. See the DateTime class there for comparison: the menu is dynamic and does not reload the page every time you click on a link, the methods have full signatures, and everything is on a single page with local links, with only the details on a second-level page.


Rails’ ideas everywhere

2007/05/09

Since I went through my “Rails immersion”, I keep seeing implementations of the same ideas everywhere. The latest find is SubSonic, which implements Active Record in the .NET space. Rather than writing more, see this screencast.
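For the impatient, the Active Record idea in one hedged sketch – hypothetical class names of my own, not SubSonic's actual generated API – just to show the shape of the pattern: one class per table, and each instance knows how to persist and find itself.

```csharp
using System;
using System.Collections.Generic;

// Illustration of the Active Record idea only - hypothetical types,
// not SubSonic's actual generated classes. A real implementation would
// talk to the database instead of an in-memory dictionary.
public abstract class ActiveRecordBase<T> where T : ActiveRecordBase<T>
{
    private static readonly Dictionary<int, T> store = new Dictionary<int, T>();

    public int Id;

    public void Save() { store[Id] = (T)this; }        // object persists itself
    public static T FindById(int id) { return store[id]; }  // class-level finder
}

public class Product : ActiveRecordBase<Product>
{
    public string Name;
}

class Demo
{
    static void Main()
    {
        Product p = new Product();
        p.Id = 42;
        p.Name = "Widget";
        p.Save();
        Console.WriteLine(Product.FindById(42).Name);   // prints "Widget"
    }
}
```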


New great podcast

2007/04/27

Running out of Security Now! and TWIT episodes, I have subscribed and started listening to a few newly discovered podcasts.

I have started to listen to the .NET-specific one from Scott Hanselman named Hanselminutes. There are a couple of factors that make it better than other geeky podcasts out there. The first reason is the content quality. There is a very high signal-to-noise ratio, pretty much all content counts, and both guys, in addition to being smart, are quite good at striking the right balance between staying on topic and spontaneity.

The second reason – it is not technology-religious and is quite pragmatic. Scott obviously likes .NET and is passionate about Microsoft technologies – but there is no sucking-up; Scott is very open-minded – just listen to the Dynamic Languages episode where they talk about Ruby on Rails. He even owns a Mac and tests multiplatform software on multiple platforms :-)

The third reason: there is a PDF transcript available with lots of good links which would otherwise be lost (unless you listen with a pen in your hand and in front of a computer, and not while driving or walking as I do).

And last but not least – very good audio quality, professionally recorded and processed. After listening to the episode about professional audio processing, it was clear why. Episodes are reasonably short – 20 to 40 minutes.

I have learned quite a lot from them about a variety of interesting things – e.g. that WPF/E (recently renamed Silverlight) may actually be something I really want to look at :-). It almost sounded too good to be true.

So if .NET is part of your world – or you want it to become part of your world – go for HanselMinutes.


Very useful data structure and algorithms library

2007/04/02

.NET 2.0 offers a very rich and nicely designed library of core data structures, collections and algorithms. Occasionally, you run into a situation where you need something that is not in there. Before starting to design your very own extension of LinkedList or HashTable, look into the interesting open-source project NGenerics – chances are you will find it there.

It contains quite a few new data structures:

extensions of existing data structures to work with the Visitor pattern (see the sketch below)

and implementations of algorithms – sorting:

and general:

Nicely written, documented, comes with unit tests :-) and is under a very liberal license.
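To illustrate the Visitor idea mentioned above, here is a minimal, self-contained sketch – a hypothetical interface of my own for illustration, not necessarily NGenerics' actual API: the traversal is written once, and different visitors plug in different per-item work.

```csharp
using System;
using System.Collections.Generic;

// Illustration of the Visitor idea only - hypothetical interface,
// not necessarily NGenerics' actual API.
public interface IVisitor<T>
{
    void Visit(T item);
    bool HasCompleted { get; }   // lets a visitor stop the traversal early
}

public class SumVisitor : IVisitor<int>
{
    public int Sum;
    public void Visit(int item) { Sum += item; }
    public bool HasCompleted { get { return false; } }   // never stops early
}

class Demo
{
    static void Accept<T>(IEnumerable<T> items, IVisitor<T> visitor)
    {
        foreach (T item in items)
        {
            if (visitor.HasCompleted) break;
            visitor.Visit(item);
        }
    }

    static void Main()
    {
        SumVisitor sum = new SumVisitor();
        Accept<int>(new int[] { 1, 2, 3, 5, 8 }, sum);
        Console.WriteLine(sum.Sum);   // prints 19
    }
}
```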

See also the author’s blog and the article he wrote on CodeProject. Thanks Riaan, your code is appreciated.

Btw, speaking of Fibonacci, did you know Fibonacci was only his nickname and the real name of this Italian mathematician was Fibbooonnnnnaaaaaaaaccccccccccccciiiiiiiiiiiiiiiiiiiii? :-)


Converting eBooks to Sony Reader format

2007/03/22

Since yesterday, I have made nice progress in solving my issues with content creation for the PRS500 and its readability. There are several ways to proceed:

The simplest is to download Book Designer. It is free for non-commercial use and the current version, 5.0 Alpha, does the job very well. It allows you to load the source in text, HTML, Lit, PDF, PalmDoc (pdb/prc), rb and a few other formats and process it into the native LRF format – plus a few others I do not really care about. The result is a nice, readable LRF file with three font sizes, nicely formatted, with metadata. As an added benefit, because the author is Russian, the program does not assume that the English alphabet is the only one in existence and allows you to select the encoding. The result is quite good – most of the extended Czech/Slovak characters are there, some are missing and displayed as a space (namely ř, ě, ľ …), but it is readable. What may be the better option is that with English as the language and the default encoding, the software “downscales” the extended characters to the closest English equivalents: ř -> r, ě -> e – which results in the familiar computer Czech/Slovak. I am very comfortable with option 2, and will work on getting the correct font for option 1.
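The “downscaling” is presumably doing something along these lines – my guess at the technique, not Book Designer's actual code – a minimal .NET 2.0 sketch that strips diacritics by decomposing each character and dropping the combining marks:

```csharp
using System;
using System.Globalization;
using System.Text;

class Downscale
{
    // "Computer Czech/Slovak": decompose each accented character
    // (ř -> r + combining caron) and drop the combining marks.
    static string StripDiacritics(string text)
    {
        string decomposed = text.Normalize(NormalizationForm.FormD);
        StringBuilder sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(c);
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    static void Main()
    {
        Console.WriteLine(StripDiacritics("řeč, ľudia"));   // prints "rec, ludia"
    }
}
```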

If you want to read more about the program, go here and here – as long as you can read Russian. I found out that even after 22 years of not using Russian, I can still read and understand it reasonably well …

The program is useful for creating Palm books as well as Microsoft Reader Lit books. I did not try that yet. The user interface of Book Designer is not exactly Apple-made – extremely technical, geekish – looking like it was designed by an engineer for engineers :-) – here is how it looks. But it is the functionality that counts. Thank you – whoever made this possible :-).

If you want to actually understand how the LRF format works and how the book is formatted at a very low level, read the format spec and then download BBeBinder from Google Code. It is a C# 2.0 project which aims to create something similar to Book Designer – but as an open-source, GPL-ed application. It is a very early version (0.2) but in the true spirit of open source, it actually (mostly) works. I have downloaded it and looked inside the code. The solution contains a BBeB encoding/decoding library and the main program, which was nicely designed with extensibility in mind. Using plugins, it allows you to add additional input data formats (it currently works well for text files and some HTML; I had mixed results with others).
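The plugin idea is roughly this – a hedged sketch with made-up names (BBeBinder's real interfaces surely differ): each input format gets its own reader that turns a source file into a neutral book model, which the LRF writer can then consume.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical plugin contract - illustration only, not BBeBinder's actual API.
public interface IInputFormatPlugin
{
    string Name { get; }                 // e.g. "Plain text", "HTML"
    bool CanHandle(string fileName);     // usually decided by extension or sniffing
    BookModel Read(string fileName);     // parse into the neutral model
}

public class BookModel
{
    public string Title;
    public List<string> Paragraphs = new List<string>();
}

// A trivial plain-text reader: every non-empty line becomes a paragraph.
public class PlainTextPlugin : IInputFormatPlugin
{
    public string Name { get { return "Plain text"; } }

    public bool CanHandle(string fileName)
    {
        return Path.GetExtension(fileName).ToLowerInvariant() == ".txt";
    }

    public BookModel Read(string fileName)
    {
        BookModel book = new BookModel();
        book.Title = Path.GetFileNameWithoutExtension(fileName);
        foreach (string line in File.ReadAllLines(fileName))
        {
            if (line.Trim().Length > 0)
                book.Paragraphs.Add(line.Trim());
        }
        return book;
    }
}
```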

If both of my projects were not in the C# space (which is causing me to be slightly over-see-sharped at the moment), I would not mind volunteering a few hours for this – to make sure that Central European encodings are handled OK :-).


Times are a-changing

2007/03/12

Unusually early this year, we switched to Daylight Saving Time. The clocks sprang ahead one hour on Sunday morning. Btw, this is a good mnemonic for remembering which direction the clock goes: it Springs ahead and it Falls back … Technically, it was a decision of the US legislature to start DST early, but because of the integration of the two economies, Canada had little choice but to follow. So, for the next three weeks or so, we are one hour closer to Europe.

By coincidence, we have been working on time- and timezone-related issues in both of my C#-related projects. Both projects deal with data captured in different geographical locations and need to interpret the data timestamps from the point of view of a user in a particular timezone. In theory, .NET offers good support for time zones. In reality, the support is not really that great after all.

What you can do very easily is convert between the local time of the Windows client and UTC. What you cannot do easily is convert between any particular timezone and UTC – the conversion functions do not accept a timezone code as a parameter. What you also cannot do right out of the box is provide the user with a list of all time zones and access information like the offset, date and time of DST etc. (btw – both of these problems are much simpler in Java). Yes, it is not too hard to do it in C#, but every solution I have seen leads to one of two problems:

a) it provides a .NET wrapper over the Windows timezone database in the registry, or

b) it maintains its own timezone information (in a database, an XML file, a Web service – you name it).

Neither of these solutions is really clean. The Windows registry database is incomplete (it contains only 75 zones) and its accuracy depends on whether Windows Update is enabled or not (Windows workstations installed in October without automatic updates did NOT recognize the early DST start this year). Accessing the registry may lead to privilege problems and is generally problematic in Web applications. The information in the registry is pretty cryptic and the “unique key” is a non-descriptive integer with no relation to the external world (the Index). If you can live with these limitations, look at this CodeProject article and at Michael Brumm’s SimpleTimeZone class.
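For illustration, a minimal sketch of approach a) – the local↔UTC conversion that is easy, plus listing the timezone entries from the registry. The registry path is the standard one on Win2003/XP; the exact set of per-zone values varies, so this only reads the display names:

```csharp
using System;
using Microsoft.Win32;

class TimeZoneDemo
{
    static void Main()
    {
        // The easy part: converting between the *local* zone and UTC.
        DateTime utcNow = DateTime.UtcNow;
        DateTime local = TimeZone.CurrentTimeZone.ToLocalTime(utcNow);
        Console.WriteLine("UTC {0} is local {1}", utcNow, local);

        // Approach a): read the Windows timezone database from the registry.
        // Needs sufficient privileges - ASP.NET worker processes often cannot
        // read HKLM, which is one of the problems mentioned above.
        const string keyPath = @"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones";
        RegistryKey zones = Registry.LocalMachine.OpenSubKey(keyPath);
        foreach (string name in zones.GetSubKeyNames())
        {
            RegistryKey zone = zones.OpenSubKey(name);
            // "Display" holds the human-readable name, e.g. "(GMT-05:00) Eastern Time ..."
            Console.WriteLine(zone.GetValue("Display"));
            zone.Close();
        }
        zones.Close();
    }
}
```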

If you want to create and maintain your own timezone database, you first need to get an authoritative source of the TZ data and then make sure you keep it up to date. If you use an on-line source, you need permanent connectivity or some sort of synchronization – which adds to the amount of work required. A good starting point is this webpage.


Introduction to CSLA – lunch presentation at ODNC

2007/02/28

The Ottawa .NET Community organized an interesting lunch-and-learn session today: the first of two sessions introducing Rocky Lhotka’s CSLA Framework. Usual location – the Glacier Room at the downtown Microsoft office. I decided to attend for two reasons: the first was that I knew the presenter – David Campbell. David is a great guy – a consultant with 20 years of history in the software development business, running his own company. Last year, when we were looking for people with CSLA experience in Ottawa, David’s name came up first.

My second reason was an educational one. No, I did not really expect to learn something new about CSLA. We have been using the framework on two projects since summer 2006, I have read the book (and agree with David that it is good but a pretty dry read) and have also read and written quite some amount of code that uses the framework. I certainly do not think that I know enough about CSLA (as the German proverb says, man lernt nie aus – one never stops learning), but it is hard for an introductory session to go to the level of detail that uncovers anything new to active users of CSLA. What I was looking for was inspiration on how to present the framework to developers who are new to it – and I was not disappointed. And btw, I *did* learn something completely new about CSLA: that SmartDate trick with “.”, “+” and “-” is really neat (see the source code of the SmartDate unit tests).

What I always enjoy about the ODNC sessions is the discussion during and after the presentation. It was like that last time (Adam Machanic) and it was like that today. People ask great questions (OK – with one exception – if you are an ODNC regular, you know who I am talking about).

We have had lots of in-house discussion about the relative pros and cons of using CSLA. In our projects, portions of the CSLA functionality are not so important: we do not really need multi-level undo, for one example. On the other hand, the location-independent business objects and the scalability it gives you are really nice. Yes – CSLA forces you to do things a certain way, which may not be considered ideal, but at least it results in a consistent approach across the codebase.

CSLA has a pretty steep learning curve, even with the books available, and its way of doing things can look strange to a seasoned object-oriented designer. Heavy use of generics and the design of the core framework classes force you to use very flat object hierarchies. Instead of inheritance, it pushes you either towards sharing functionality by wrapping or towards code generation. I am not exactly crazy about the read-only/read-write object dichotomy – without the use of inheritance, it often leads to code duplication.
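To show what I mean by the dichotomy, here is a minimal sketch with stand-in base classes rather than the real Csla ones (the real BusinessBase<T>/ReadOnlyBase<T> add the data portal, validation, undo and property authorization): because each class must derive from its own generic base, the two variants cannot share a common parent, and the same property code ends up written twice or generated.

```csharp
using System;

// Stand-ins for Csla.ReadOnlyBase<T> / Csla.BusinessBase<T> - illustration only.
public abstract class ReadOnlyBase<T> where T : ReadOnlyBase<T> { }
public abstract class BusinessBase<T> where T : BusinessBase<T>
{
    protected void PropertyHasChanged() { /* dirty tracking lives here */ }
}

// Read-only variant, typically used for lists and lookups.
public class ProjectInfo : ReadOnlyBase<ProjectInfo>
{
    private string name = string.Empty;
    public string Name { get { return name; } }
}

// Editable root object. The Name property (and its friends) is duplicated,
// since ProjectInfo and ProjectEdit cannot inherit from a shared Project class.
public class ProjectEdit : BusinessBase<ProjectEdit>
{
    private string name = string.Empty;
    public string Name
    {
        get { return name; }
        set { if (name != value) { name = value; PropertyHasChanged(); } }
    }
}
```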

Also, the book example with Projects and Resources is IMHO not the most illustrative one: it puts too much emphasis on dealing with lists of objects and does not illustrate many important aspects of dealing with root objects and switchable (root/child) objects. I had trouble using this example for in-house training and mentoring: it is not simple enough to make obvious things really obvious and not comprehensive enough to cover many everyday situations.

Despite all that, our experience with CSLA has been overall positive: we were very pleased with the performance compared to plain datasets, and after a few initial hiccups, the framework allows you to create a very solid, reusable, scalable layer of mobile business objects.

David is going to do Part Deux of the CSLA intro, which should be a practical exercise of creating an address book application based on CSLA with multiple UIs. Looking forward to it – maybe that example will fill the gap …

And btw – thanks, David.


Hiring programmers – or Degrees of Done

2007/02/21

Back in the old country, I once had a programmer working for me who made himself famous with the following quote:

“I have fixed that bug, do you want me to compile it too ?”.

In his mind, he was done as soon as he identified the bug and put the fix in the code. All the rest was trivial and unimportant routine.

During the last 20 years I have worked with many very different developers and learned the hard way how many different meanings the word “done” can have. For some, done means that coding has just finished and he/she successfully compiled the projects and stepped through some of the code paths with the debugger with negligible occurrences of major crashes. Whether the code is in CVS, commented – who cares?

For others it means that a unit test was written and run, the code is commented and documented, everything is properly tagged and versioned in the source control system, built and deployed via an automated build system (now do NOT laugh, the latter are not mythical creatures, I really have met a few guys like that). The problem is that it takes some time to find out who is where on the done-ness scale, and making sure that the whole team is on the same degree of “done-ness”, or at least in agreement on what done means, can present quite a challenge and consume a considerable amount of the project manager's and/or team lead's time.

What is your degree of done is a great question to ask when you are building a new team and hiring developers. We had an interesting discussion with Connie on the topic of hiring criteria and both came to the conclusion that the biggest mistake you can make is to put too much weight on the best technical skill-set match. The longer the project, the less emphasis should be placed on the skill-set match – simply because skills will evolve, but non-technical personality traits do not change – and in the long run, these may cause the most trouble. For a smart person with a good education and wide enough experience, it is much easier to fully master some special area of expertise, especially when there is some time available. On the other hand, things like teamwork, work ethic, social skills, communication or the “degree of done” are very hard to get and even harder to fix or develop.

The other reason why hiring only by skill match can be tricky is: how do you evaluate a supposed expert’s expertise? If you do not have another expert (ideally a better one), how do you validate how much of the claimed experience is truly there and how much is just nice wrapping, an empty shell lacking depth? The risk in hiring somebody with boasted experience but a lack of depth is that this person will inevitably become the key design/implementation influencer and decision maker in the given area of expertise. Eventually, as the project moves ahead, other team members will catch up, understand more and start questioning some of the expert decisions made – and the fun begins.

Another important “must have” for a good hire is a multicultural background. No, I do not mean *that* type of multiculturalism the politicians like to talk about: I mean information technology cultures such as platforms, operating systems and languages, and the person’s exposure to several of them.

If you have a guy who has worked on non-Microsoft platforms, it will be much easier to make him understand why automated builds, scripted installs and all that old-fashioned command line stuff are so important – because that person will have seen the value and power of scripting, repeatability and automation. This is very hard to explain to a guy who has spent all his life just clicking buttons and checkboxes in GUI tools and whom the black box of the command line prompt either scares or annoys. (OK, let’s be fair here: until PowerShell’s availability, the command prompt in Windows was both scary in its ineptitude and annoying compared to e.g. Bash.)

If you have a seasoned Oracle database guy who has used the database on Solaris, Linux and Windows, you will never have to mention or explain that you need scripts for things like creating the schema, loading demo data etc. You will get nice, handwritten, commented, parametrized scripts for everything, whether you ask for them or not. These guys just operate that way, because until recently there was no GUI and because this is the fastest way (hi Bob :-)). On the other hand – ask a SQL Server database developer to work using a “script first” approach and you may find out that it is not so obvious to him, that he does not get it (he just keeps on clicking in Management Studio) and that the scripts you eventually get are far from good – because they are generated from changes done via the GUI, not really written. To be left maintaining such code – good luck.

So what is the best hire? Smart people with a good education and problem-solving capabilities, with solid, verifiable experience, a wide multicultural background (in the computing sense), a proven record of working in teams of various sizes, ideally well versed in the business area you are working in, with a work ethic compatible with your organization, a shared communication culture and a similar degree of done. It does not really matter how many keywords the resume matches: if you need a great C# programmer and have a choice between an experienced Java guy who has never touched .NET and a 10-year senior Windows programmer who spent most of his/her career on Visual Basic-like platforms, it will typically end up like this: pick the good Java guy, give him 4-6 weeks, and the reward will be beautiful, maintainable code with very natural C-sharpness. Pick the VB guy, and you will “save” the start-up time – but you will more likely get average, harder to maintain, less scalable code, because the object design and object thinking just are not there to the same degree as in the Java case. The longer the time spent on VB version 6 or less, the worse.

An advance apology to all my VB.NET-loving, VB.NET-writing friends – please, no flame wars; all of you absolutely are the exceptions confirming the rule, you are the crowd that gets OOP, and besides, we all know that VB.NET is just syntactic sugar on top of C#, of course the “better” type of sugar :-). I am not picking on you. I said VB but I really meant Perl …

And btw, Tidlo – if you are reading this, send me an email. Hope you are still programming and occasionally compiling after you fix the bug.


DotNet Development Toolbox

2007/02/18

I have recently picked up a very interesting book that I read about a year ago: Coder to Developer by Mike Gunderloy, with the subtitle Tools and Strategies for Delivering Your Software. It is an excellent book and I highly recommend giving it a look if you are in the software development business on the Microsoft platform. Easy to read, practical, useful. When I was reading it back in 2006, it was before we had set up our own development lab and started the biometric project. With this recent experience, I was re-reading the book with a quite different view: unlike before, I knew what we had tried, what worked and what did not. Unlike before, I now have broader experience with what it means to develop software and run a large project inside an organization, to create and maintain infrastructure for a team of developers and to lead the project design and implementation.

The book has 14 chapters addressing various areas, from starting a new project and organizing it, to using tools such as source control, unit testing, the IDE, bug tracking, logging, build tools etc. While starting up the lab and the biometric project, we had to go through pretty much every chapter of the book. Sometimes we made the same choice as Mike recommends, sometimes a different one. Here is our toolbox, in order of the book chapters:

Ch-3 – Source code control
Mike mentions several source code control systems: BitKeeper, ClearCase, CVS, Subversion, Perforce, VSS. We excluded VSS (because of its reliability record) and ClearCase (because of complexity and price), and the final selection was between CVS and Subversion. Actually, between CVS-NT and Subversion. We decided to adopt both, starting with CVS, because it was more familiar to the majority of the team members.

At the beginning, we were also considering VSTS, but the prohibitive price, the complexity and the low version number were the reasons we decided to wait for at least Service Pack 2 before considering it.

Ch-5 – Unit Testing

Compared to other projects, we managed to be quite successful in implementing unit tests and TDD. The BO layer has over 100 unit tests, which helped to catch several pretty vicious bugs in the early days. We settled on MbUnit instead of NUnit because of very useful extensions such as row tests. MbUnit is pretty much a superset of NUnit – see more on the project Wiki page. MbUnit works very well; the only disadvantage is the very limited documentation.
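Row tests are the extension that sold us: one parameterized test body replaces several near-identical NUnit tests. A minimal example, with the attribute names as I remember them from MbUnit 2.x:

```csharp
using MbUnit.Framework;

[TestFixture]
public class DiscountTests
{
    // One test body, several data rows - each [Row] becomes its own test case.
    [RowTest]
    [Row(100, 0, 100)]
    [Row(100, 10, 90)]
    [Row(80, 25, 60)]
    public void AppliesPercentDiscount(int price, int percent, int expected)
    {
        int actual = price * (100 - percent) / 100;
        Assert.AreEqual(expected, actual);
    }
}
```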

Ch-6 – IDE

Visual Studio – what else? In version 2005 it provides a very good feature set. Refactoring support is still not quite as comprehensive as in Eclipse, but very good nevertheless. Some developers use special plugins – e.g. Visual AssistX – but others (including myself) found it rather counterintuitive and liked plain VS.NET better.

One of the plugins that we evaluated and actually liked was TestDriven.NET – a nice VS add-in for running MbUnit, csUnit and NUnit. Unfortunately, the publisher has a very strange licensing policy: the more licenses you buy, the more you pay per seat – a Professional license costs $95, but if you need more than one, you must purchase the Enterprise version ($135). Out of principle, to be “voting with our dollars”, we decided not to go with TestDriven.NET. Let’s hope that more people will do the same and eventually the author will get the message.

To be continued