Building applications in Azure
One of the most natural uses of the cloud is for web applications. You may already be using virtual machines on your own systems to make deploying your applications easier, either to new hardware or to additional servers. Microsoft Azure uses virtualization too, but it also brings useful benefits that virtualization cannot deliver alone. By hosting your application in the cloud, you can leverage automatic scaling, load balancing, system health monitoring, and logging. You also benefit from the fact that managed cloud platforms help narrow the attack surface of your system by automatically patching the operating system and runtimes and by keeping systems sandboxed. Let’s look at some examples of how to build some common web applications inside of Microsoft Azure.
Imagine that you work for a retailer who generates a significant amount of revenue through online sales. Imagine also that this retailer has been around for long enough that it already has an established web architecture that runs in a private data center. This retailer has decided that it wants to move to a hosted platform so that it no longer has any data center responsibilities and it can focus on its core business. How do you replatform this web application into Microsoft Azure? Let’s first identify some requirements for this system:
- It has high utilization and needs to serve a large number of concurrent users without timing out, even during peak hours such as Black Friday sales.
- It needs to accommodate a wide variety of products in its database that do not necessarily all follow the same schema.
- It needs a fast and intelligent search bar so that customers can find products easily.
- It needs to be able to recommend products to customers as they shop to help generate additional revenue.
Regardless of how these requirements are being met today in the private data center, I can suggest some guidelines for reproducing this system in Microsoft Azure in a way that improves performance rather than merely replicating the existing design. I will take each requirement in order and explain which Azure components you can use to meet it.
Requirement 1: High utilization
One reliable way to guarantee performance under high load is to separate your app into disconnected tiers that can each operate at different performance levels. This ensures that a fast and responsive web frontend is never directly waiting on a slower database-transaction-driven backend while it is serving responses to web browsers. The web tier is allowed to run fast while the database tier is allowed to run a little slower. The way you connect these tiers together is through a brokered messaging system (Figure 1-5). Both ends of this architecture, front and back, can scale independently of one another to avoid performance bottlenecks. The message broker is the key to this pattern. The frontend can send messages to the broker asynchronously, as fast as needed. The backend pulls messages from the broker as fast as it can, but its speed does not have to match that of the frontend since neither tier is waiting on the other to proceed.
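The decoupling described above can be sketched in a few lines. This is a minimal in-process simulation, not Azure code: a standard-library `queue.Queue` stands in for the message broker, a thread stands in for the backend tier, and the function name `run_tiers` and the `backend_delay` parameter are hypothetical names chosen for illustration.

```python
import queue
import threading
import time


def run_tiers(order_count: int, backend_delay: float = 0.02) -> tuple[float, list[str]]:
    """Simulate a fast frontend and a slower backend joined by a broker."""
    broker: queue.Queue = queue.Queue()  # stand-in for a real message broker
    processed: list[str] = []

    def backend() -> None:
        # The backend pulls messages at its own pace until it sees the sentinel.
        while True:
            message = broker.get()
            if message is None:
                break
            time.sleep(backend_delay)  # simulate a slow database transaction
            processed.append(message)

    worker = threading.Thread(target=backend)
    worker.start()

    # The frontend enqueues every message without waiting for the backend.
    started = time.perf_counter()
    for i in range(order_count):
        broker.put(f"order-{i}")
    frontend_seconds = time.perf_counter() - started

    broker.put(None)  # tell the backend there is no more work
    worker.join()
    return frontend_seconds, processed
```

Calling `run_tiers(20)` returns the time the frontend spent enqueuing and the list of messages the backend eventually processed. The frontend finishes long before the backend does, which is the point of the pattern: neither side blocks on the other.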
To achieve this architecture in Microsoft Azure, you will need a scalable web frontend, a message broker, and a scalable backend. There are a lot of different ways to accomplish this, but there are specific options that will help you to meet your requirements better than others.
Frontend tier. Since you are migrating an existing and mature platform into the cloud, you may have built your existing solution with extra services installed on your web servers or even some third-party software packages that add functions to your system. You can still use this type of design in the cloud, but it does limit your options a bit since it means you need more control over the underlying virtual machines that are hosting your site.
For the frontend, your web server layer, I recommend building an Azure Cloud Services web role solution. Web roles, as opposed to app services, can scale automatically and far enough to meet very heavy demand. Also, unlike virtual machines, web roles give you the benefits of Platform as a Service hosting, including automatically patched servers and runtimes to minimize your infrastructure responsibilities. Web roles can still manipulate the servers upon which they run, so you can install additional services or special configuration options as a startup task when the web role is deployed. In this way, web roles provide you with a good balance of custom configuration options, automatic scaling to very large instance counts, and a managed platform with automatic patching without any hardware responsibilities.
As discussed previously, a web role is simply a Windows Server that installs and starts IIS as part of its deployment process. Since web roles are a form of Platform as a Service, your role as a developer is to create the application code; Azure deploys that code to virtual machine images on its own. You don’t have any role in setting up those virtual machines, although you can customize them by adding scripts to the startup process. Visual Studio has project templates you can use to create a web role project as a normal .NET web application, but you can also use Java, Node.js, PHP, Python, or Ruby and your IDE of choice. Web role projects provide you with a role entry point class (WebRole.cs in a .NET app) where you can place code you want to execute during the start, stop, and run events of the server instance. The Azure platform will monitor the status of your web roles automatically and will recycle the instances if the web server crashes.
Cloud services, which encompass both web roles and worker roles, are unique in that they are stateless machines. Each cloud service instance comes with a certain amount of local disk storage that you can use for temporary files, but this local storage is completely erased during the startup process. Files you need to persist should be stored in Azure Blob Storage or in Azure Files, which can be attached to your cloud services through SMB as an additional, and persistent, drive.
Additionally, cloud services do not come configured for any type of user session state management. The recommendation is that you architect applications that run on Azure Cloud Services to be stateless, so that all of the data the application needs to process a request is included in the URL or in HTTP headers. One good reason for this is that stateless applications scale better, since they do not have to load session state from a central location in order to handle user requests. This matters because cloud services are automatically load balanced in a round-robin fashion: unless you have only one server instance running, it is very unlikely that the same user will keep hitting the same server while clicking on links in a web application. If you have to manage session state, every click could require a different server to load that session state all over again, which can significantly hurt performance. If you do have to use session state for your application, I recommend keeping it in a fast in-memory cache such as Redis Cache (a managed Azure service) or in a key/value NoSQL store such as Azure Table Storage to minimize the performance impact.
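To make the round-robin point concrete, here is a small sketch in which consecutive requests from one user land on different instances, so any session data has to live in a shared store. Everything in it is hypothetical: `SessionStore` is a plain dictionary standing in for a cache such as Redis, `WebInstance` stands in for a web role instance, and `itertools.cycle` stands in for the load balancer.

```python
from itertools import cycle


class SessionStore:
    """Hypothetical shared store; a dict stands in for a cache such as Redis."""

    def __init__(self) -> None:
        self._data: dict[str, dict] = {}

    def load(self, session_id: str) -> dict:
        # Create an empty session on first use, otherwise return the existing one.
        return self._data.setdefault(session_id, {"cart": []})


class WebInstance:
    """One web server instance that keeps no per-user state of its own."""

    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self.store = store

    def add_to_cart(self, session_id: str, item: str) -> tuple[str, list[str]]:
        session = self.store.load(session_id)  # state comes from the shared store
        session["cart"].append(item)
        return self.name, list(session["cart"])


store = SessionStore()
instances = [WebInstance(f"web-{n}", store) for n in range(3)]
balancer = cycle(instances)  # round-robin: each request goes to the next instance

responses = [
    next(balancer).add_to_cart("user-42", item)
    for item in ["book", "laptop", "lamp", "desk"]
]
```

Each of the four requests is served by a different instance in turn, yet the cart stays consistent because no instance holds the session itself. The cost is one load from the shared store per request, which is why a fast store matters.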
I also suggest storing static files, such as images, in Azure Blob Storage, for two reasons. First, the storage is inexpensive, can be made publicly accessible, and supports high throughput for web access. Second, by splitting your files across multiple domains, you enable web browsers to download them in parallel, improving performance.
Backend tier. For the backend tier, I recommend that you use the other type of cloud service template: worker roles. Worker roles are managed by Azure the same way as web roles, but with two important differences. First, worker roles do not come configured with IIS. You are allowed to do whatever you want with a worker role, so you could configure your own web server and use it to host web applications, but Azure would not manage this for you. Second, since worker roles do not have an Azure-managed web server, Azure can only tell if your application has crashed by watching for events in the role entry point class (WorkerRole.cs for .NET projects). For a worker role, you need to start your actual application inside of the role entry point’s Run() method and have it run in an infinite loop wrapped in a try/catch code block. This way, only an application crash would cause the Run() method to return, and this lets Azure know that your server instance needs to be recycled.
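The shape of that loop can be sketched in Python. This is not the .NET role entry point API; it only illustrates the control flow. The function `run` and its `poll` and `handle` parameters are hypothetical, `ValueError` is an arbitrary choice of "recoverable" error, and the `None` sentinel exists only so the sketch can terminate when run locally. A real worker role would keep looping.

```python
import logging
from typing import Callable, Optional


def run(poll: Callable[[], Optional[str]], handle: Callable[[str], None]) -> None:
    """Process messages in a loop; only an unhandled error ends the loop early."""
    while True:
        try:
            message = poll()
            if message is None:
                # Assumption for local testing only: None means "shut down".
                return
            handle(message)
        except ValueError as error:
            # A recoverable problem with one message: log it and keep looping.
            logging.warning("skipping bad message: %s", error)
        # Any other exception propagates out of run(), which is the signal
        # that the instance is unhealthy and should be recycled.
```

The design choice to note is which errors are caught. Catching everything would hide real crashes from the platform, while catching nothing would recycle the instance on every malformed message.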
Message broker. To facilitate the message broker, I recommend that you use Azure Queue Storage. Queue Storage is the simplest and most cost-effective way to broker messages between application tiers, and it will satisfy the needs of this application just fine. Azure Service Bus supports more features, but this isn’t necessary here since all you need is a simple queue.
Scalable architecture. With an architecture like this, you can build a stateless web application as a scalable web role that will be the frontend of your site. This tier will read from the data layer but it will not write to it directly; it will place orders on an Azure queue for the backend tier to pick up and process. The scalable worker role backend tier will have the responsibility of fulfilling orders based on transactionally consistent inventory counts and prices. This way, when the user clicks to purchase an item from the store, the web role will take payment, place the order on the queue, and then immediately tell the user to expect an email soon with a confirmation and shipping details. The worker role will be reading messages from the queue constantly. Once it receives this order message, it will process it against currently accurate inventory and then email the user with a shipping confirmation or, if the system was out of stock by that point, with a backorder confirmation telling the user when to expect the shipment. This way, even if several hundred users all click the buy button at the same time when the inventory count is 3, none of them will have to wait while the site processes the order and none of them will get the wrong message about whether or not their orders will be fulfilled right away.
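The backend's decision can be illustrated with a short sketch. It assumes orders are drained in arrival order by a single consumer, with a `deque` standing in for the Azure queue and a dictionary standing in for the transactional inventory table. The `Order` type and `process_orders` function are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Order:
    customer: str
    sku: str


def process_orders(orders: deque, inventory: dict[str, int]) -> list[tuple[str, str]]:
    """Drain the queue in arrival order and decide each order's outcome."""
    emails = []
    while orders:
        order = orders.popleft()
        if inventory.get(order.sku, 0) > 0:
            inventory[order.sku] -= 1  # reserve one unit for this order
            emails.append((order.customer, "shipping confirmation"))
        else:
            emails.append((order.customer, "backorder confirmation"))
    return emails
```

With five queued orders for an item whose inventory count is 3, the first three customers receive a shipping confirmation and the last two receive a backorder confirmation. None of them waited on this processing at checkout time.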
As your application grows, you can either set Azure to automatically scale your front- and backend tiers onto additional server instances to meet user demand, or manually provision additional instances on your own. Either way, since you have created this application with stateless services, the different server instances will not conflict with one another as they work, even if you scale to hundreds of instances. Likewise, if demand goes down, you can scale back down to fewer instances to manage cost.
Requirement 2: Wide variety of products
There is an architectural pattern known as polyglot persistence, which is another way of saying that an application uses multiple different database types to store its data. For our retail sales web application, we can use this polyglot persistence pattern to store different types of data in specific database types that can best handle the nuances of that data.
Since our retail application stores many different types of products, storing this data in a database that requires a uniform schema across all products is somewhat problematic. A document database is ideal to help solve this problem since it stores documents, normally formatted as JSON, that can exist in any shape so long as they have a key field. If you want to store a book product, you can store it in its own natural shape; give it a page count, authors, editors, illustrators, and so on. Then you can store another product, like a laptop computer, with completely different properties, such as battery life, screen size, processor specs, and memory. Not only can the properties be unique for each document, but the properties can naturally be objects on their own. For the laptop example, the processor property could be an object with its own properties like clock speed, L2 cache, and manufacturer. Documents in this type of database exist as hierarchical structures that can contain nested objects of nearly any shape. A database like this is not suited for complicated relationships, though, so instead of joining tables (or documents) together you will want to store all of the data that a document needs together inside that document. For example, instead of storing a customer in one document and that customer’s addresses in another document, you would store the customer with an object array called addresses that contained all of the address data inside.
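The document shapes described above might look like the following. All field names and values are hypothetical and chosen only to mirror the book, laptop, and customer examples in the text. The `has_key_field` helper is an illustrative check, not part of any database API.

```python
import json

# Hypothetical product documents: each has its own shape, sharing only "id".
book = {
    "id": "book-001",
    "type": "book",
    "pageCount": 320,
    "authors": ["Example Author"],
    "editors": ["Example Editor"],
}
laptop = {
    "id": "laptop-001",
    "type": "laptop",
    "batteryLifeHours": 10,
    "screenSizeInches": 13.3,
    "memoryGB": 16,
    # A property can itself be an object with its own properties.
    "processor": {"manufacturer": "Example Corp", "clockSpeedGHz": 2.4, "l2CacheMB": 4},
}
# Related data is embedded rather than joined from a second document.
customer = {
    "id": "customer-001",
    "name": "Example Customer",
    "addresses": [
        {"kind": "billing", "city": "Seattle"},
        {"kind": "shipping", "city": "Portland"},
    ],
}


def has_key_field(document: dict) -> bool:
    """The only structural rule assumed here: every document carries an id."""
    return isinstance(document.get("id"), str) and bool(document["id"])


# Both products serialize to JSON side by side despite their different shapes.
catalog_json = json.dumps([book, laptop], indent=2)
```

The book and the laptop share no properties beyond `id` and `type`, and the customer's addresses travel with the customer, so reading one document returns everything needed to display it.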
For our example, it still makes good sense to store product inventory and orders in a relational database that can enforce transactions and maintain a consistent state. Product information, however, conforms much better to a document database, where each product’s unique properties can be stored appropriately without affecting other products.
If you already have an investment in a different document database, such as MongoDB or RavenDB, you can still use that within Microsoft Azure. Both of these, and more, are offered through external vendors in the Azure marketplace. I recommend Azure DocumentDB, Microsoft’s own document database service, over these others because it is offered as a fully managed and scalable service that can be purchased in small incremental units, it supports a SQL-like syntax, and it guarantees a specific performance level based on purchased units that can scale alongside your application.
Requirement 3: Fast search
As a developer, you know how much users depend on search engines and on search functions within your own apps. Providing search functionality can take many forms. You may prefer a database option such as full-text indexes with SQL queries, or you may use an indexing service such as Apache Solr. Either way, I think you would agree that creating your own search service is hard work and takes up time that could be better spent elsewhere.
For a web application like our retailer site, we want a search service that can scale without a lot of manual intervention on our part. I certainly recommend using an indexing service, as opposed to creating your own database-built solution. There are a few externally hosted options in the Azure marketplace, including Apache Solr, but there is also a new offering from Microsoft called Azure Search.
Azure Search is very new, so I suggest you weigh your options carefully, but the fact that Azure Search is a managed service that can scale elastically without the need to purchase expensive blocks of dedicated server space makes it an appealing option. Also, although it is new, it comes equipped with full-text search, weighted results, autocomplete suggestions, geospatial data, and faceted navigation. This is a rich feature set, to be sure.
Requirement 4: Product recommendations
To drive product recommendations, your best tool will be a type of NoSQL database known as a graph database. This type of database is designed to store nodes of data that are connected to one another through relationships. A database like this is optimized to traverse these relationships to quickly solve queries such as “What is the shortest route between nodes A and B?” or “Which nodes are connected in two or fewer hops that are also connected to nodes C and D?” This type of analysis is ideal for locating products that some customers have purchased alongside products that a current user may have in the shopping cart.
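The "purchased alongside" query can be sketched with a small in-memory co-purchase graph. This is a plain Python illustration of the idea, not a graph database and not Neo4j's query language. The functions `build_graph` and `recommend` and the sample baskets used with them are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_graph(purchases: list[list[str]]) -> dict[str, Counter]:
    """Connect every pair of products bought together; weight = times co-purchased."""
    graph: dict[str, Counter] = defaultdict(Counter)
    for basket in purchases:
        for a, b in combinations(sorted(set(basket)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def recommend(graph: dict[str, Counter], cart: list[str], limit: int = 3) -> list[str]:
    """Rank products one hop away from the cart that are not already in it."""
    scores: Counter = Counter()
    for item in cart:
        for neighbor, weight in graph.get(item, {}).items():
            if neighbor not in cart:
                scores[neighbor] += weight
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked][:limit]
```

Given past baskets such as a laptop with a mouse and a bag, a cart containing the laptop would surface the mouse and the bag as candidates. A graph database performs the same kind of neighbor traversal, but over persisted data and at a much larger scale.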
To make this work inside Microsoft Azure, your options are more limited than with the other components we have discussed so far. Of all the graph databases in common use today, only Neo4j is available from the Azure marketplace, and it is only offered as a virtual machine image. There is nothing wrong with this, of course, but it does mean that you will be responsible for managing the operating system and database configuration yourself since it is not a managed service.
Online retail store complete architecture
Putting this all together, you can use the Azure components we have discussed here to create an online retail store web application that can scale to meet customer demand, even under intense load; that can offer all of the advanced features the business requires; and that minimizes the requirements on you as the developer. The diagram in Figure 1-6 puts all of the component pieces together.