Checking out Eucalyptus

I'm installing and deploying an experimental Eucalyptus Cloud. Eucalyptus is an open source framework for creating private cloud systems. I have been involved in Grid computing research for a number of years now. Unfortunately, there isn't a single Grid middleware that can be easily deployed by common users. Grids are, of course, more advanced than contemporary cloud systems in that they allow the aggregation of multi-site resources, while Eucalyptus aggregates resources in a single site (a local cluster).
I'm mostly interested in the usability aspect: how easy it is to deploy a cloud. I believe the reason the Grid has suddenly become unpopular is that Grid middleware is not easy to use and deploy.
I will be blogging about my experiences in a series of posts. I will also make the VMs I create publicly available so that you guys can download them and play with a Eucalyptus Cloud.
Cloud Computing, and what it means for the Grid
I'm currently attending the 23rd Open Grid Forum in Barcelona, which is one of the largest Grid computing events on the calendar. This year there has been quite a stir caused by the rise of cloud computing services such as Amazon Web Services, which are seen as competitors to Grids such as OSG or EGEE.
Today, in his keynote, Amazon CTO Werner Vogels presented the cogent business logic which underlies Cloud computing, and why he thinks Amazon AWS has been so successful. Amazon AWS allows anyone from little startups (e.g. Animoto) to large scale companies (e.g. Salesforce) to get access to cheap, reliable, fault tolerant and scalable computing infrastructure. Had these companies done their computing locally, they would probably have spent 80% of their resources on setting up and running the computing infrastructure, which is an undifferentiated service because anyone who wants to do large scale computing has to set up and run such infrastructure. The infrastructure costs money and doesn't make any money for the organization. What does make money is the 20% of business logic the company uses its computing infrastructure for. So with Cloud computing, companies can get rid of the 70-80% of effort spent on largely undifferentiated services and focus on the tasks which will eventually make them money! This has been increasingly realized in all quarters of the industry, and hence we see major companies jumping on the bandwagon: Microsoft's Live Mesh, Google's App Engine, IBM and SAP's effort with the EU on Reservoir, etc. So from a layman's perspective it does look like Cloud computing is here to stay.
But what about Grid Computing? Grid Computing is analogous to cloud computing but with marked differences:
Grids are application specific! The EGEE Grid infrastructure is designed to run e-Science applications, anything from particle collision analysis to neuro-image analysis for Alzheimer's.
Grids impose application development models: High Energy Physics Grids (such as LCG, EGEE and parts of OSG) are mostly designed to work with Grid applications which are a pipeline of tasks. These Grids schedule and map the tasks onto distributed resources and make the results available to the users. Other Grids (such as those built with Globus 4.0, OMII or Taverna) are service Grids, which assume that Grid applications consist of services instead of tasks. This is the major division in modern Grid computing: task based vs. service based. There are other paradigms as well, which are variants of each.
Cloud Computing does not impose any such restriction. Clouds provide the infrastructure as a black box and it is up to the user how to use it: he can virtualize an entire Globus Grid on it, set up a gLite Grid and work with tasks, or use neither and just set up a cluster for MPI applications. Hence we can see that Clouds definitely have an edge here.
But having said all this, I do believe that Cloud computing and Grid computing will happily co-exist in the future. The infrastructure which Cloud computing will use will be Grids themselves, which will not schedule applications; rather, they will schedule virtual machines which encapsulate the user's application. But what happens if an application consisting of two virtual machines ends up being scheduled with its VMs in different sites or subnets? These issues and others will be tackled by a recent project which has also been covered extensively at this OGF, and I will cover it later.
Semantic Web Workshop at HP Labs Bristol
Today I attended the Semantic Web Workshop at HP Labs Bristol. HP Labs in Bristol, UK is HP's second largest research facility in the world, and one of the largest computing research facilities in Europe. The Labs focus on many different core technologies, including quantum information processing, location based wireless services, the semantic web, agent-based computing, privacy and identity management, and mobile health care.
So what is the "semantic web"? At its most basic, it is a set of technologies which aim to make information on the web (which is predominantly human oriented) machine readable, allowing users to express ontologies and models so that machines and software do the reasoning instead of humans. Semantic applications could, for example, find patterns of symptoms for a specific disease, or allow you to find the nearest doctor via a simple search on the internet. For the Semantic Web to work, people need to define data about the data, "metadata", so that machines can process it according to some model. This metadata is usually defined in the Resource Description Framework (RDF) or in the Web Ontology Language (OWL), which many claim is more expressive than the more popular RDF. Just as there is SQL to query a database, there are query languages to query metadata, which enable reasoning on the actual data. A popular query language is SPARQL (pronounced "sparkle"), and sure enough this language was designed at HP Labs Bristol. So the building blocks of the semantic web are in place, but where are the applications?
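To make the SQL analogy a bit more concrete, here is a minimal sketch in plain Python of what querying metadata looks like. It is not SPARQL and it does not use Jena or any real RDF library. The triples, the names in them (doctor_smith, worksIn and so on) and the match and query helpers are all made up for illustration. The only idea borrowed from SPARQL is that terms starting with "?" are variables to be filled in from the data.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple,
# which is the basic shape of RDF metadata. All data here is hypothetical.
TRIPLES = [
    ("doctor_smith", "type", "Doctor"),
    ("doctor_smith", "worksIn", "Bristol"),
    ("doctor_jones", "type", "Doctor"),
    ("doctor_jones", "worksIn", "Bath"),
    ("hp_labs", "locatedIn", "Bristol"),
]


def match(pattern, triples):
    """Return one binding dict per triple that fits the pattern.

    Terms starting with '?' are variables; anything else must equal
    the corresponding term of the triple exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable repeated in the pattern must bind consistently.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


def query(patterns, triples):
    """Join several patterns: keep bindings that satisfy all of them."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for solution in solutions:
            # Substitute variables that are already bound, then match.
            bound = tuple(solution.get(term, term) for term in pattern)
            for binding in match(bound, triples):
                next_solutions.append({**solution, **binding})
        solutions = next_solutions
    return solutions


# "Which doctors work in Bristol?"
doctors_in_bristol = query(
    [("?who", "type", "Doctor"), ("?who", "worksIn", "Bristol")],
    TRIPLES,
)
print(doctors_in_bristol)
```

A real semantic web stack adds namespaces, typed literals and inference on top of this, but the core of a query is the same kind of pattern matching over triples.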
To a layman, progress on the semantic web seems to be very slow. In fact, most of Web 2.0 has nothing to do with the Semantic Web, which was seen as the next generation Web. However, there is semantic web software all over the place at the backend. At the workshop it was revealed that companies such as Amazon and Google have deployed semantic web technologies and are keeping their platforms proprietary for competitive advantage. Semantic web technologies exist, but not yet in a framework which people can readily use.
One of the prime projects at HP Labs Bristol is Jena, an open source semantic web framework and one of the first of its kind. At the workshop a few more were introduced: Talis, an easy to use framework for developing semantic web applications, an effort by Ingent Publishing Cooperation, etc.
Many applications were presented at the workshop, which really showed off the capabilities of the Semantic Web. The focus of the Semantic Web community currently seems to be on creating user friendly frameworks and development environments in which users can develop semantic web applications. Just as DreamWeaver, FrontPage and Expression help us develop Web applications, similar tools are required for the next generation of Web applications.
Overall it was an informative event, especially for me, as I wasn't well versed in Semantic Web technologies before.

Me in front of Building 3, HP Labs, and beneath is the main building of the complex
Has AJAX killed Java Applets?
Java introduced the notion of browser based applications in the form of Java applets, which brought the feel of feature rich desktop applications into the browser! They were a huge success and widely popular during the dot com bubble, but now I rarely hear about nifty new applets. Web Start gained some traction, but the furor over it has died down. Instead, another technology has taken the reins: AJAX, asynchronous JavaScript and "XML". AJAX is the marriage of server side scripting languages such as Perl or PHP with client side scripting in JavaScript (and perhaps a little Flash), embedded in HTML, with a little XML or some other mechanism to transfer information. The result is a powerful framework for creating desktop-like applications in the browser! AJAX has taken computing by storm. The wave of AJAX based startups, acquisitions and mergers has already begun! Are we heading towards another bubble, or is this one for real?
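For the curious, here is a rough sketch of the server half of that marriage, written in Python only because it is compact (the post mentions Perl and PHP, and any server side language would do). Everything in it is an assumption for illustration: the /events/ URL, the sample calendar data and the function names are made up, and it sends JSON as the "some other mechanism" instead of XML. The browser side, not shown, would be a few lines of JavaScript that request a URL like /events/Mon in the background with XMLHttpRequest and redraw just one part of the page.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical data the page wants to refresh without a full reload.
EVENTS = [
    {"day": "Mon", "title": "Lab meeting"},
    {"day": "Wed", "title": "Paper deadline"},
]


def events_for(day):
    """Return only the events for one day, serialized as a JSON string."""
    return json.dumps([e for e in EVENTS if e["day"] == day])


class AjaxHandler(BaseHTTPRequestHandler):
    """Answers background requests with a small data payload, not a page."""

    def do_GET(self):
        # Expecting a made-up URL such as /events/Mon
        prefix = "/events/"
        if not self.path.startswith(prefix):
            self.send_error(404)
            return
        body = events_for(self.path[len(prefix):]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the endpoint; this blocks until interrupted."""
    HTTPServer(("localhost", port), AjaxHandler).serve_forever()
```

The point of the sketch is that the server returns a few bytes of data and never a whole HTML page, which is why AJAX pages can update without reloading.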
AJAX applications are a definite improvement over the earlier page based applications that flickered on every reload. What's more, these applications are NOT as memory hungry as Java applets used to be, and no runtime environment is required to run them; a compatible, up to date browser is enough. Nowadays many more innovative AJAX applications are being built than Java applets, and people are accepting AJAX more readily than ever. This year, for the first time, I heard of people doing AJAX based undergraduate degree projects in Pakistan, and there is talk in my research lab of developing AJAX based interfaces to some of the distributed applications developed here. Not long ago, browser based applications used to be the exclusive domain of Flash and Java applets.
So what was wrong with applets? I don't need to list the reasons; you can check out this poll conducted on Java.net, the official Java forums, where complaints ranged from "too hard to deploy" to "too slow to load"!
So we can safely say that AJAX has killed Java applets, a significant portion of standard Java.
From Google Trends we can see how much interest people have in the three technologies (Java applets, Web Start and AJAX) and compare the news reference volume. AJAX has certainly ruled the headlines in recent months and years.
As wireless connectivity expands and gets increasingly cheaper, more and more mobile devices will get connected, and once this happens the future of J2ME looks bleak. I personally have started using AJAX based services like Airset.com, which provides an intuitive calendar solution that also helps you remain in touch with your friends and family. The built-in calendar program which came with my PDA was developed in J2ME; it lacked a lot of functionality, and I found it a bit clunky to use. Another drawback of J2ME applications which you won't find in AJAX applications is the mutual incompatibility between mobile devices. For example, I can purchase a game from the internet designed for Symbian mobiles, but when I try to run it on a Symbian cellphone it won't run! After a little investigation it turns out that the game supports Nokia Symbian mobiles and certain other sets, NOT Motorola ones. Likewise, try running a J2ME game for the Motorola E680i on a Nokia N92: it won't work!
The greatest impediment to AJAX, of course, is the lack of connectivity. Once connectivity is seamless and universal, people will stop using native applications like those developed in J2ME and start using AJAX ones.
One complaint I have about AJAX services is that many of them run poorly on cellphones and PDAs. This is certainly not a problem with AJAX itself, because services such as Gmail and Airset.com prove that AJAX can run in supported browsers on mobile devices if the service is well developed.
Linux rises as a Supercomputer Operating System, but faces hurdles on the Desktop
In Nov 1997, 99.2% of the top 500 supercomputers in the world ran Unix.
Nine years later, its market share has been eroded by Linux, which now runs on 73.5% of the top 500 supercomputers.
In 1998 Linux made its debut in the Top500 List, an authoritative list of the top 500 supercomputers in the world.
After 1998, it took Linux 7 years to break the magical 50% mark. As of November 2004, it ran on 60.2% of the top 500 supercomputers, and in 2006 it nearly reached the 75% mark. I believe that Linux will go all the way and completely take over Unix's user base. Together, the two OSes run on about 94.4% of the top 500 supercomputers.
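The figures quoted above also imply Unix's current share, which the post never states directly. A quick back-of-the-envelope calculation, assuming the combined 94.4% refers to the same edition of the list as the 73.5% Linux figure (the variable names are mine):

```python
# Shares of the Top500 list in percent, as quoted in this post.
unix_1997 = 99.2          # Unix, Nov 1997
linux_2006 = 73.5         # Linux, nine years later
linux_plus_unix = 94.4    # Linux and Unix combined

# Assumption: the combined figure refers to the same list as the 73.5%.
unix_2006 = round(linux_plus_unix - linux_2006, 1)
unix_loss = round(unix_1997 - unix_2006, 1)
other_2006 = round(100 - linux_plus_unix, 1)

print(f"Unix now: {unix_2006}%")
print(f"Unix lost: {unix_loss} points")
print(f"Everything else: {other_2006}%")
```

Under that assumption Unix is left with roughly a fifth of the list, having lost close to four fifths of it in nine years.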
The rise of Linux has come largely at the cost of Unix. This makes sense because the two operating systems are very similar in nature and are operated the same way, so the cost of switching is minimal. Linux is an open source, stable, secure, multi-user and multitasking operating system, and all these factors make it ideal for a supercomputer. Supercomputer manufacturers can modify the kernel source to suit their hardware. The kernel is monolithic but borrows an element of microkernel design in that it supports loadable kernel modules, which makes kernel development much simpler.
Contrasting open source Linux with closed source Unix leads to a major observation: a lot of money is flowing into Linux these days, and Linux has a large, active community which contributes to it and keeps making the OS better, whereas the money flowing into Unix is stagnant. Since the commercialization of Unix, the operating system has been largely in decline; I expect the decline to continue and see it as irreversible. All these factors conspired to erode Unix's market share in the supercomputing world.
Linux has made remarkable progress from a hobbyist's project in 1991 to the leading OS on supercomputers and server systems. Linux has also made inroads into the embedded market. Motorola has been very successful in marketing Linux based smartphones like the E680i, which I personally use and have no regrets about buying.
One area where Linux lags is the desktop market, which of course Microsoft rules. Many things have conspired to make Linux's entry onto the desktop difficult! For starters: a weak GUI (KDE is great, but in my opinion increasingly bloated), a focus on the command-line interface (no matter how good the GUI is, you always have to go back to the command line at some stage), and the difficulty of installing and managing open source software (dependency hell! Gentoo's emerge, yum and conary do a great job of resolving that).
But two hurdles, I think, have been overlooked: software piracy and the lack of support for open source! The most natural user base for an open source operating system would be in the Third World. Countries which cannot pay for basic amenities for their populations can hardly be expected to purchase proprietary software from multi-billion dollar companies. So open source software provides them a platform with which they can compete with the developed nations. However, if I look around in my own country, Pakistan, I hardly see any Linux deployments. The Government uses MS software in its offices, businesses use MS software, and people use MS software in their homes! MS Office is taught as part of the curriculum at high school. Most of the software is of course pirated, costing about half a dollar per CD (Windows is 1 CD). Because of software piracy, MS Windows and related software are ubiquitous in Pakistan. Software piracy helps proprietary software companies more than it hurts them! Precisely because it makes people dependent on their technology! No one I know is even considering switching to Linux: because they see Windows everywhere around them, they believe using Linux would put them at a disadvantage, so they had better stick to Windows, no matter how many viruses they get!
Not long ago I had an instructor at university who was a senior officer in the Pakistan Armed Forces; he happened to be the director of the Army directorate responsible for IT policy. He was a complete "Microsoft Guy". He never considered any software solution which was not from Microsoft! I once had a discussion with him about why his directorate was making the Army dependent on Microsoft software instead of taking the initiative to make it "independent" by indigenously customizing open source software to suit its purposes. He replied that this was impossible in the current scenario! There are no companies in Pakistan which provide support for open source software, and the Army requires heavy-duty support, which, as far as they were concerned, only Microsoft provided. Indeed, in Pakistan there are virtually no companies which provide support for open source software. Most universities in Pakistan are busy churning out software professionals who are adept at Microsoft technologies, and open source figures low on their curricula. With no support companies, it is very unlikely that any enterprise will take the risk of switching to Linux.
Getting Started with Condor
Cluster computing emerged as a field in the early 90s, when hardware prices were dropping and PCs were getting more and more powerful. Companies were shifting from large "Mini Computers" to small and powerful "Micro Computers", and many people realized that this would lead to large scale wastage of computing power, as computing resources were becoming more and more distributed. Organizations today have hundreds of PCs in their offices, many of them idle most of the time. However, organizations today also require huge computing power to remain competitive, so the demand for supercomputing solutions, which nowadays are largely built on cluster computing ideas, remains stable.
However, it is also possible to forgo the purchase of commercial cluster computing solutions and set up your own cluster using freely available software. This article is about one such package, developed by the University of Wisconsin, called Condor.
Read the full article at Linux Journal
Xen
In the last half century, microcomputers have become increasingly powerful. Server systems have grown so powerful that many enterprise servers are typically underutilized. Modern computers are sufficiently powerful to use virtualization to present the illusion of running many virtual systems on a single machine. Each virtual system runs a separate operating system instance simultaneously. So, you can run multiple instances of Linux at the same time on the same machine, or you can run combinations of operating systems, such as Linux, FreeBSD, Windows and so on. This has led to a resurgence of interest in Virtual Machine (VM) technology, which has been around for decades on bigger iron.
Read the full article on Linux Journal
