.NET on Linux faster than on Windows? Hmm

2007/06/27

An interesting article on JavaLobby caught my eye today: Do .NET Applications Run Better on Java?

Normally, knowing the not exactly impartial focus of Java-centric sites such as JavaLobby or theserverside.com, one should be careful when reading how much Java outperforms .NET. The bias works the other way too – just look at Theserverside.net or any other .NET-centric site to see how superior C# is :-). With that in mind, I looked at the technical report.

The report was produced by Mainsoft, the company behind the cross-compiler product MainSoft for Java EE. Cross-compilation means that the C# or VB.NET code is first compiled into CLR bytecode using standard Microsoft tools and then transformed into Java bytecode, using the CLR bytecode as input. The study was based on a fairly large project – 260,000 lines of code. The published results show that the translated code running on a Java VM and the Websphere platform outperformed the .NET stack on both the Windows and the Linux platform.

So far so good. I have no plan to question the results of the test. One could argue that because the evaluation was done not by an independent third party but by the authors of the cross-compiler, the result had to come out the way it did – simply because if the measurements had shown that .NET performs better, no report would have been published 🙂

First of all, “faster” does not really mean much faster. The measured speed increase is 8% in throughput and very much the same for requests per second. An 8% increase is much too small to justify doing anything major with the application, certainly not re-platforming …

Second, a comparison based on a single port of one application proves absolutely nothing about the results of repeating the same process for another application, let alone every application. It may just as easily be an exception as an indication of the general rule. I am pretty sure that, given the chance to respond, Microsoft or some other party interested in the opposite outcome could find a C# application that would perform worse after conversion.

A more interesting question is why you would want to do this – replace Win2003 + .NET CLR with some operating system (Windows/Linux/something else) plus Java plus Websphere. Clearly, performance cannot be the reason – at least not based on these results.

Price is not a good reason either. From a cost-saving perspective, cross-compiling a .NET application to run in Java under Windows makes no sense, because the .NET runtime is part of the Win2003 license and the cost of the license is there in both cases. This leaves using Linux as the platform (or some other free alternative). True, Linux is free – but support is not, and neither is labor. In real life, the initial costs of licenses are small compared to the accumulated costs of supporting an application in production – and professional support for Windows or Linux is comparably priced. Besides, I bet that the savings gained from not paying for the Windows license will not cover the cost of the Websphere license plus the Mainsoft Java EE Enterprise edition license. True, you could use a free J2EE server such as Tomcat, Glassfish or JBoss with the free Grasshopper version of the cross-compiler – but you may not get the same performance numbers. I like Tomcat and use it all the time: it is nice, flexible, easy to configure – but not the fastest servlet container out there (there must be a reason why Mainsoft picked Websphere with their own enterprise version, after all) …

What is the conclusion? The article above and the approach it describes can be a lifesaver if you have lots of .NET code and *must* for some real reason switch platforms. The reason may be technical – or not – just consider the magic Google did with Linux. The report does hint at one possible good reason – moving your application from PCs to a really big machine – a multiprocessor (multi means more than 16 these days, when desktop machines are starting to get quad-cores ;-)) running Unix, or a mainframe. The report shows that AIX on a Power5+ based system with 4 CPUs did ~3400 requests per second whereas the PC-based one did 2335. This would be interesting if the comparison were fair – but it was not. The AIX machine had 32 GB of RAM whereas the PC (with Linux or Windows) had 2 GB, and you can imagine the price difference in the hardware.
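To put the AIX numbers in proportion, here is a quick back-of-the-envelope calculation. It uses only the figures quoted above; the variable names are made up for this sketch, and the ~3400 value is approximate to begin with.

```python
# Figures as quoted from the report above.
aix_rps = 3400   # AIX on Power5+, 4 CPUs, 32 GB RAM (approximate)
pc_rps = 2335    # PC with Linux or Windows, 2 GB RAM

speedup = aix_rps / pc_rps   # how many times more requests per second
ram_ratio = 32 / 2           # how many times more memory

print(f"throughput: {speedup:.2f}x, memory: {ram_ratio:.0f}x")
# prints: throughput: 1.46x, memory: 16x
```

Roughly 46% more throughput on a machine with sixteen times the memory says little about the platform itself.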

But if there is no really compelling business reason for switching platforms, sticking with Windows when you want to run a .NET application may save you a lot of work – and most likely some money as well.


MSDN Documentation – the worst in class?

2007/06/14

Has it ever happened to you that you were using some tool day after day – and never realized its pretty big deficiencies? Until somebody coming from a different background pointed out everything that is wrong with the tool? Before that moment of revelation, the issues were just an inconvenience, but right after it they became a real annoyance.

Exactly this happened to me last week, and the credit for pointing out what is wrong with MSDN documentation (and the “standard” .NET documentation format in general) goes to Joel 🙂

For a developer using an object-oriented language such as C#, Java or Ruby, what you need on a daily basis is to find information about a class and see its public interface: members, constructors, method signatures. Ideally on a single page, with the possibility of drilling down to the details of a method and to a code example. You also very often need to see all implemented interfaces, have easy access to the parent class and (in the case of e.g. an interface inside a framework) reach the implementing or derived classes within this context.

Unfortunately, the Microsoft .NET documentation makes this simple task not exactly easy, pleasant or fast. As an example, let’s take something really simple, e.g. the DateTime struct. In the documentation, information about this simple type is spread over 6 pages: the DateTime structure itself, Members, Fields, Constructor, Methods and Properties. If you expect that with this devotion to low-level categorization the page for e.g. Methods will give you all the details about all DateTime methods, you are wrong. What the Methods page gives you is just a list of names, not even method signatures – parameter types and return values are missing. To get this information, you must click through to the page dedicated to that method. If the method is overloaded (take e.g. the omnipresent ToString), the Methods page contains only one name, and only the next page gives you the signatures, each linked to yet another page with details. See for yourself:

picture-2.png

In addition to the bad structuring of information, almost every link causes a full page reload.

Compare this with how much more usable the Java documentation is: it is very easy to see all methods, constants, parent classes and implemented interfaces on a single page. The dated frames-based UI actually makes a lot of sense and is (short of an AJAX-based dynamic site) a much better way to navigate the documentation.

With all that said, I am not surprised that tools such as Reflector are so extremely popular in the .NET world. Reflector is not only a very useful debugging/inspection tool; thanks to its excellent and compact presentation of the class information retrieved through reflection, it is also the fastest way to get meaningful information on the API of core classes. Other than Reflector, the other fast way to get information on .NET core library details is a Google search.
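The "everything about a class on one page, with full signatures" idea is cheap to get from reflection. The sketch below is in Python rather than C#, and it is not how Reflector works internally; the Stopwatch class and the summarize helper are hypothetical names invented for the example.

```python
import inspect


class Stopwatch:
    """Hypothetical example class; stands in for any library type."""

    def __init__(self, label: str = "default") -> None:
        self.label = label
        self.elapsed = 0.0

    def add(self, seconds: float) -> float:
        """Add seconds to the running total and return the new total."""
        self.elapsed += seconds
        return self.elapsed

    def reset(self) -> None:
        """Set the running total back to zero."""
        self.elapsed = 0.0


def summarize(cls: type) -> list[str]:
    """Return one line per public method: its name plus full signature."""
    lines = []
    # getmembers returns (name, function) pairs sorted by name.
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_") and name != "__init__":
            continue  # skip private helpers, keep the constructor
        lines.append(f"{name}{inspect.signature(member)}")
    return lines


for line in summarize(Stopwatch):
    print(line)
```

Each printed line carries the parameter names, types and return type, which is exactly what the MSDN Methods page leaves out.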

Try, for example, how fast you can reach the documentation for a particular class starting with a Google search – let’s take e.g. WebConfigurationManager. The Google search returns blazingly fast (as always) – with the MSDN page as the first hit. Now compare how fast you get the same information starting from the MSDN home page (which is, btw, advertising ‘new and improved search and navigation’). Your mileage may vary, but I usually see a 3-8 second delay in the search response (compared to <0.5 sec for Google). A few seconds seems like no problem, but when you do it all the time, it easily becomes pretty annoying. Even more so when you realize that Google is searching the WHOLE WEB, with content they do not own or control, only index and rank, whereas MSDN search is searching the MSDN data repository, which is – however you measure it – many orders of magnitude smaller, and Microsoft fully controls most of its content.

Why can’t the largest and most powerful software company create documentation that is useful and usable? Even the documentation for the open-source Mono project (a port of .NET to Linux and other platforms) is *much* better than the original. See the DateTime class there for comparison: the menu is dynamic and does not reload the page every time you click on a link, the methods have full signatures, and everything is on a single page with local links, with only the details on a second-level page.


Rails’ ideas everywhere

2007/05/09

Since I got through “Rails immersion”, I keep seeing implementations of the same ideas everywhere. The latest find is Subsonic, which implements Active Record in the .NET space. Rather than writing more, see this screencast.


New great podcast

2007/04/27

Running out of Security Now! and TWIT episodes, I have subscribed and started listening to a few newly discovered podcasts.

I have started to listen to the .NET-specific one from Scott Hanselman named Hanselminutes. There are a couple of factors that make it better than other geeky blogs out there. The first reason is the content quality. There is a very high signal-to-noise ratio, pretty much all the content counts, and both guys, in addition to being smart, are quite good at achieving the right balance between staying on topic and spontaneity.

Second reason – it is not technology-religious and is quite pragmatic. Scott obviously likes .NET and is passionate about Microsoft technologies – but there is no sucking-up. Scott is very open-minded – just listen to the Dynamic languages episode where they talk about Ruby on Rails. He even owns a Mac and tests multiplatform software on multiple platforms 🙂

Third reason: there is a PDF transcript available with lots of good links which would otherwise be lost (unless you listen with a pen in your hand and in front of a computer, and not while driving or walking as I do).

And last but not least – very good audio quality, professionally recorded and processed. After listening to this episode about professional audio processing, it was clear why. Episodes are reasonably short – 20 to 40 minutes.

I have learned quite a lot from it about a variety of interesting things – e.g. that WPF/E (recently named Silverlight) may actually be something I really want to look at :-). It almost sounded too good to be true.

So if .NET is part of your world – or you want it to become part of your world – go for HanselMinutes.


Very useful data structure and algorithms library

2007/04/02

.NET 2.0 offers a very rich and nicely designed library of core data structures, collections and algorithms. Occasionally, you run into a situation where you need something that is not in there. Before starting to design your very own extension of LinkedList or HashTable, look into the interesting open-source project NGenerics – chances are you will find it there.

It contains quite a few new data structures:

extensions of existing data structures to work with Visitor pattern

and implementation of algorithms – sorting:

and general:

Nicely written, documented, comes with unit tests 🙂 and under a very liberal license.

See also the author’s blog and the article he wrote on Codeproject. Thanks Riaan, your code is appreciated.

Btw, speaking of Fibonacci, did you know Fibonacci was only his nickname and the real name of this Italian mathematician was Fibbooonnnnnaaaaaaaaccccccccccccciiiiiiiiiiiiiiiiiiiii? 🙂


Converting eBooks to Sony Reader format

2007/03/22

Since yesterday, I have made nice progress in solving my issues with content creation for the PRS500 and its readability. There are several ways to proceed:

The simplest is to download Book Designer. It is free for non-commercial use and the current version 5.0 Alpha does the job very well. It allows you to load a source in text, HTML, Lit, PDF, PalmDoc (prd/prc), rb and a few other formats and process it into the native LRF format – plus a few others I do not really care about. The result is a nice, readable LRF file with three font sizes, nicely formatted, with metadata. As an added benefit, because the author is Russian, the program does not assume that the English alphabet is the only one in existence and allows you to select the encoding. The result is quite good – most of the extended characters from Czech/Slovak are there; some are missing and displayed as a space (namely ř, ě, ľ …), but it is readable. What may be a better option is that with English as the language and the default encoding, the software “downscales” the extended characters to their closest English counterparts: ř -> r, ě -> e – which results in the familiar computer Czech/Slovak. I am very comfortable with the second option, and will work on getting the correct font for the first.
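The “downscaling” of extended characters is easy to sketch. The snippet below is not taken from Book Designer (its internals are not described anywhere I have seen); it is one common technique, Unicode decomposition from Python's standard library, and the function name downscale is made up for this example.

```python
import unicodedata


def downscale(text: str) -> str:
    """Replace accented letters with their unaccented base letters."""
    # NFD splits e.g. "ř" into the base letter "r" plus a combining caron.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep everything else untouched.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(downscale("Příliš žluťoučký kůň"))
# prints: Prilis zlutoucky kun
```

Letters that have no decomposition (Polish ł, for instance) pass through unchanged, so a real converter would need an extra mapping table for those.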

If you want to read more about the program, go here and here – as long as you can read Russian. I found out that even after 22 years of not using Russian, I can still read and understand it reasonably well …

The program is also useful for creating Palm books as well as Microsoft Reader Lit books. I have not tried that yet. The user interface of Book Designer is not exactly Apple-made – extremely technical, geekish – looking like it was designed by an engineer for engineers 🙂 – here is what it looks like. But it is the functionality that counts. Thank you – whoever made this possible :-).

If you actually want to understand how the LRF format works and how the book is formatted at a very low level, read the format spec and then download BBeBinder from Google Code. It is a C# 2.0 project which aims to create something similar to BookDesigner – but as an open-source, GPL-ed application. It is a very early version (0.2), but in the true spirit of open source it actually (mostly) works. I have downloaded it and looked inside the code. The solution contains a BBeB encoding/decoding library and the main program, which is nicely designed with extensibility in mind. Using plugins, it allows adding further input data formats (it currently works well for text files and some HTML, and I had mixed results with others).

If both of my projects were not in the C# space (which is leaving me slightly over-see-sharped at the moment), I would not mind volunteering a few hours for this – to make sure that Central European encoding is handled OK :-).


Times are a-changing

2007/03/12

This year, unusually early, we switched to Daylight Saving Time. The clocks sprang ahead one hour on Sunday morning. Btw, this is a good mnemonic for remembering which direction the clock goes: it Springs ahead and it Falls back … Technically, it was the decision of the US legislature to start DST early, but because of the integration of the two economies Canada had not much choice but to follow. So, for the next two weeks or so, until Europe switches on the last Sunday of March, we are one hour closer to Europe.
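The dates involved can be worked out from the rules themselves. This is a small sketch using plain date arithmetic, with no timezone database needed; the helper names are invented here. The rules encoded are the new North American one (second Sunday in March), the previous one (first Sunday in April) and the European one (last Sunday in March).

```python
from datetime import date, timedelta

SUNDAY = 6  # date.weekday(): Monday is 0, Sunday is 6


def nth_sunday(year: int, month: int, n: int) -> date:
    """Return the n-th Sunday (1-based) of the given month."""
    first = date(year, month, 1)
    offset = (SUNDAY - first.weekday()) % 7  # days until the first Sunday
    return first + timedelta(days=offset + 7 * (n - 1))


def last_sunday(year: int, month: int) -> date:
    """Return the last Sunday of the given month."""
    # First day of the following month, rolling over the year in December.
    next_month = date(year + (1 if month == 12 else 0), month % 12 + 1, 1)
    last_day = next_month - timedelta(days=1)
    return last_day - timedelta(days=(last_day.weekday() - SUNDAY) % 7)


new_rule = nth_sunday(2007, 3, 2)  # second Sunday in March
old_rule = nth_sunday(2007, 4, 1)  # first Sunday in April
eu_rule = last_sunday(2007, 3)     # last Sunday in March

print(new_rule, old_rule, eu_rule)
# prints: 2007-03-11 2007-04-01 2007-03-25
```

So the switch came three weeks earlier than under the old rule, and two weeks ahead of Europe.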

By coincidence, we have been working on time- and timezone-related issues in both my C#-related projects. Both projects deal with data captured in different geographical locations and need to interpret the data timestamps from the point of view of a user in a particular timezone. In theory, .NET offers good support for time zones. In reality, the support is not really that great after all.

What you can do very easily is convert between the local time of the Windows client and UTC. What you cannot do easily is convert between an arbitrary timezone and UTC – the conversion functions do not accept a timezone code as a parameter. What you also cannot do right out of the box is provide the user with a list of all time zones and access information like the offset, the date and time of the DST switch etc. (btw – both of these problems are much simpler in Java). Yes, it is not too hard to do in C#, but every solution I have seen takes one of two approaches, each with its own problems:

a) provides a .NET wrapper around the Windows timezone database in the registry

b) creates its own timezone information (in a database, XML file, Web service – you name it).

Neither of these solutions is really clean. The Windows registry database is incomplete (it contains only 75 zones) and its accuracy depends on whether Windows Update is enabled (Windows workstations installed in October without automatic updates turned on did NOT recognize the early DST start this year). Accessing the registry may lead to privilege problems and is generally problematic in Web applications. The information in the registry is pretty cryptic, and the “unique key” is a non-descriptive integer with no relation to the outside world (the Index). If you can live with these limitations, look at this Codeproject article and at Michael Brumm’s SimpleTimeZone class.

If you want to create and maintain your own timezone database, you first need to get an authoritative source of the TZ data and then make sure you keep it up to date. If you use an on-line source, you need permanent connectivity or some sort of synchronization – which adds to the amount of work required. A good starting point is this webpage.
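For comparison, this is roughly what the two missing operations look like on a platform that ships with the tz database: converting between a named zone and UTC, and listing all known zones. It is a sketch in Python (3.9 or later), not C#, and it assumes the tz data is installed on the machine; the zone names America/Toronto and Europe/Prague are picked purely for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, available_timezones

# A timestamp captured in UTC, the day after the early 2007 DST switch.
captured = datetime(2007, 3, 12, 15, 0, tzinfo=timezone.utc)

# Convert to named zones rather than to the machine's local zone.
toronto = captured.astimezone(ZoneInfo("America/Toronto"))
prague = captured.astimezone(ZoneInfo("Europe/Prague"))

print(toronto.isoformat())  # 2007-03-12T11:00:00-04:00 (already on DST)
print(prague.isoformat())   # 2007-03-12T16:00:00+01:00 (not yet on DST)

# And back: a wall-clock time in a named zone, converted to UTC.
local = datetime(2007, 3, 12, 11, 0, tzinfo=ZoneInfo("America/Toronto"))
print(local.astimezone(timezone.utc).isoformat())

# Enumerate every zone the database knows about, keyed by readable names.
zones = sorted(available_timezones())
print(len(zones), zones[:3])
```

The zone keys are descriptive region/city names rather than opaque integers, and keeping the rules current becomes a matter of updating one data package.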