Category Archives: Cloud Computing

Windows Azure Online Workshop

Microsoft is offering a free, two-hour online workshop for developing apps for Windows Azure.  The course includes a full-access Azure account for two weeks.  Seems like a good opportunity to learn how to develop apps for Microsoft’s cloud platform.

If you are in the Boston area, there is also a one-day event on May 8th at Microsoft NERD called Boston Azure Firestarter.  The event will feature classes and labs on developing for the Azure platform.

You will need to have Visual Studio 2008 or Visual Studio 2010 installed, as well as the Azure Tools for Visual Studio.


R&D IT Best Practices for Growing Small/Mid-Sized Biopharmas

The Mass Technology Leadership Council hosted a roundtable discussion on best practices for small/mid-sized biopharmas at Microsoft NERD on April 14.  The event description is shown below.

As smaller biopharma companies grow and mature into larger organizations, they need to transition from relatively ad hoc, spreadsheet heavy, lightly supported R&D IT environments to more comprehensive, scalable platforms and support models. This presents substantial challenges to scientists, managers, and IT professionals, ranging from prioritization of new features and functions, creation of new workflows, budget and staffing issues, in-house platforms versus COTS (commercial-off-the-shelf) platforms, and getting support for necessary changes. Our expert panel will discuss these issues and more in this MA Technology Leadership Council Life Science Cluster roundtable discussion.

The discussion covered a range of topics: virtualization of IT, data management, build vs. buy, and collaborating with scientists.

Virtualization and Cloud Computing

Cloud computing is a very hot topic these days and there is obviously a lot of interest in this area among small and mid-sized biopharmas.  The pay-as-you-go model and the reduced infrastructure costs have a lot of appeal.  Being comfortable with having your hardware, software, and especially your data outside of your direct control is still an issue for a lot of people, and regulatory requirements like HIPAA and protecting your intellectual property remain concerns as well.  Uploading large data sets, on the order of gigabytes, is still not practical over typical network connections, and it is often easier and cheaper to ship the data sets on physical media instead.
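To make the upload problem concrete, here is a quick back-of-envelope calculation.  The 10 Mbps uplink and 100 GB data set are illustrative numbers I am assuming for the sketch, not figures from the discussion.

```python
# Back-of-envelope: time to upload a large data set over a network link.
# The bandwidth and data-set size used below are illustrative assumptions.

def upload_hours(size_gb, uplink_mbps):
    """Hours needed to transfer size_gb gigabytes over an uplink_mbps link."""
    bits = size_gb * 8e9                  # gigabytes -> bits (1 GB = 8e9 bits)
    seconds = bits / (uplink_mbps * 1e6)  # megabits/s -> bits/s
    return seconds / 3600

# 100 GB over a 10 Mbps uplink, ignoring protocol overhead:
print(round(upload_hours(100, 10), 1))  # 22.2 (hours)
```

At roughly a day per 100 GB, overnight shipping of a disk can beat the network, which is presumably why shipping came up as the practical alternative.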

There was a lot of discussion about data security and whether you are more secure having your data hosted by an organization that (hopefully) has significant resources dedicated to security, or more at risk keeping the data in house.  A lot of data and IP leaves most companies every day in the form of laptops, USB sticks, email attachments, etc., which presents its own set of security concerns.

Collaborating With Scientists

There was a very good discussion about how to collaborate with the scientists to build tools to manage data.  The take-away from this discussion was to understand their workflow and to learn what they do instead of just asking what they need.  By understanding their workflow you are in a better position to understand their needs.  One of the panelists made an interesting comment that scientists tend to feel that what they do is unique, and that is why there may be some resistance to a commercial-off-the-shelf (COTS) solution.  They also tend to shut down when handed a complete solution, and are much more receptive to being part of the process of defining the solution.

As a practitioner of Agile, this makes perfect sense to me.  You can’t deliver a solution to your end users if you don’t understand their workflow, and the users should be an integral part of defining the solution.  Also, by building solutions incrementally, you allow for refinement and early feedback from your end users.

Data Management Policies

One of the themes that all the panelists touched on was trying to deal with data being spread across the company in a variety of formats.  When a company is in its early stages, lots of “data” resides in spreadsheets and PowerPoint presentations, and most people know where it is or can give you enough clues to find it, e.g. “Bob did a presentation back in May that….”.

As companies grow, this data becomes harder and harder to manage.  Content management tools like SharePoint and Documentum, or a LIMS, can address issues like these, but the setup and maintenance costs can be an issue for a small start-up.

Probably the biggest takeaway for me from the entire session was the notion of putting a data management strategy in place early on.  The strategy would address issues like where/how to store data.  A policy would not need to be restrictive, just some basic rules.  That way when the need to add tools and support to manage the data arises, incorporating the data into the system will be easier.

Coming at it from an Agile perspective, this makes a lot of sense.  Do the simplest thing, i.e. a basic policy for how to store your data, before going to the heavyweight approach of a full-blown LIMS or content management system.  The key here would seem to be making sure the policy is easy to understand and implement and is not a barrier to innovation.  The last thing you want is people NOT following the policy because it is an impediment to progress.  Coming back to the idea of collaboration and understanding your users’ workflow, you want to make your users part of defining the policy.

Build vs Buy

Build versus buy needs to be weighed by balancing the start-up costs and learning curve of a COTS solution against the long-term costs of maintaining custom solutions.

As time goes on, the maintenance costs for custom solutions can become prohibitive, especially as the amount of data (and the list of required features) grows and/or the principals who developed the tools move on.

Open source tools have a lot of appeal as they offer some level of off-the-shelf capabilities and community support with the ability to customize if needed.  People in the bioinformatics space are often not comfortable with proprietary algorithms/solutions and want to be able to see what’s under the covers.

Other Interesting Nuggets

Other interesting things that came up are…

Periodic technical audits of your tools, processes, etc.  Having someone from the outside come in and take a look at how you are doing things.  Are you behind the curve?  Have you become numb to certain pain points that could be fixed?  Finding someone to do this whom you trust and who doesn’t have a vested interest in selling you a particular solution may be a challenge, but I found this to be an intriguing idea.

What sample tracking systems are available that manage secondary samples?  For example, if you do extractions or build a library from a base sample, how do you trace that relationship back to the parent sample?
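As a sketch of what tracing that relationship could look like, here is a minimal parent/child sample model.  The class and field names are my own invention for illustration, not taken from any real sample-tracking system or LIMS.

```python
# Minimal sketch of parent/child sample lineage tracking.
# All names here are hypothetical, not from any real LIMS.

class Sample:
    def __init__(self, sample_id, parent=None, process=None):
        self.sample_id = sample_id
        self.parent = parent      # sample this one was derived from, if any
        self.process = process    # e.g. "extraction", "library prep"

    def lineage(self):
        """Walk back to the original base sample; returns IDs oldest first."""
        chain = []
        node = self
        while node is not None:
            chain.append(node.sample_id)
            node = node.parent
        return list(reversed(chain))

# A base sample, an extraction from it, and a library built from the extract:
base = Sample("S-001")
extract = Sample("S-001-E1", parent=base, process="extraction")
library = Sample("S-001-L1", parent=extract, process="library prep")
print(library.lineage())  # ['S-001', 'S-001-E1', 'S-001-L1']
```

The point of the sketch is simply that each secondary sample carries a pointer to its parent, so the full chain back to the base sample can always be reconstructed.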

As companies move towards a regulated space, tools that provide an audit trail are very appealing to IT groups.

In summary it was an interesting discussion and gave me a lot to think about.

Head in the Cloud with Windows Azure

Microsoft hosted a developer conference in Boston today.  I think for the most part you get what you pay for, and the $99 price of this event told me it would probably be more marketing than technology.  But it’s always good to get out, hear what’s going on, and talk with other developers.

There were several topics of interest to me at the conference, including F#, Silverlight, Visual Studio Team System 2010, and ASP.NET 4.0, but one of the things I was most interested in learning about was Azure, Microsoft’s foray into cloud computing.  I thought that this conference would give me a good 50,000-foot view of Microsoft’s plans for a cloud computing platform.

In the keynote, Amanda Silver referenced the Battle of the Currents, where in the early days of electrical distribution Thomas Edison’s system of direct current (DC) was pitted against Nikola Tesla and George Westinghouse’s system of alternating current (AC).  One of the disadvantages of direct current was that the power generation had to be close by due to power loss associated with transmission.  This meant that a manufacturing plant might need to have its own electrical plant, with all the associated capital costs and maintenance. This is much the same as technology companies today incurring the capital cost and IT support of maintaining their own data centers.  The ability to buy electric power from a utility company would allow the consumer to focus on their business and treat the incoming power as a service.

Microsoft’s intention in this market seems to be to offer a similar utility model that would provide the benefits of scalability, redundancy, and IT support, and allow a company that subscribes to Microsoft’s data center services to focus on its own business domain.  As any of us who have had internal data centers know, power outages, scaling, and IT support (security patches, etc.) can be a real headache for a developer, and data center upkeep keeps us from doing our real jobs: designing and writing code.

Now before you think I drank too much Microsoft Kool-Aid, I am just saying that is the idea behind cloud computing in general.  Why absorb the capital outlay and support costs of setting up and maintaining a data center when you can lease those capabilities?  Theoretically, this could lead to better support, scalability, and fault tolerance.  Microsoft seems to be positioning itself for providing data center services and support in the future.  How far off in the future is a question that remains to be answered.

The presentations on Windows Azure painted a fairly realistic picture of where the technology is today, and I give the presenters credit for that.  Michael Stiefel’s presentation was good and gave the 50,000-foot view I was looking for.  He also drilled down a little into the services provided in Azure, at a high level.  Ben Day’s presentation was particularly good, I thought, pointing out the potential of the technology while balancing that against some of the limitations of the current implementation, particularly related to Azure data storage.  I assume their blogs will have the presentations at some point, so you may want to check in.

The presentations and keynote showed how you can get started with Azure today using Visual Studio 2008.  You will need to install the Azure SDK and the Azure Tools for Visual Studio.  Azure applications can execute locally on the Development Fabric, a simulated cloud environment on your desktop.  You can also deploy a service to run in the Azure cloud, but you need to set up an account for that, and for the purposes of learning the technology the Development Fabric seems adequate.  The developer “experience” (another big buzzword at the conference) is the same as developing other Windows apps in Visual Studio.  You can debug applications normally while they are running in the Development Fabric; however, once they are deployed, the only debugging mechanism is logging statements.

This is definitely a “down the road” technology, and there are several kinks to work out, but if you want to be ahead of the curve it might not be a bad idea to try some of it out.  One of the things that came up in the presentations is that Microsoft is on the fence on some aspects of the implementation and will be looking to the developer community for feedback.  We’ll have to wait and see how all this plays out, but I am certainly willing to give it a spin.