Programming for Scalability

10.1 Introduction

Providing software that lets people do their jobs is usability; providing software that lets 10,000 people do their jobs is scalability. The term scalability encompasses many facets of software. It means stability, reliability, and efficient use of one or more computer resources. The goal of a scalable system is to be available for use at all times and to remain highly responsive regardless of how many people use the system.

Scalability, with respect to software architectures, has also come to mean extensibility and modularity. This simply means that when a software system needs to scale upward in complexity, it does not need to be overhauled with each addition. In the following pages, you will learn about both aspects of scalability. The first half of this chapter deals with scalable architecture design, which is most applicable when a distributed service requires more than one server and the system-performance-to-hardware-cost ratio is of paramount importance. This is followed by some hands-on code examples of how to provide added scalability to your application, such as load balancing and efficient thread management.

10.2 Case study: The Google search engine

Google.com is certainly the Internet's largest search engine. It serves 200 million requests per day and runs from more than 15,000 servers distributed worldwide. It is arguably one of the most scalable Internet services ever provided to the general public. Each server that Google uses is no more powerful than the average desktop PC. Granted, each server crashes every so often, and they are prone to hardware failure, but a complex software failover system is employed by Google to account for server crashes seamlessly. This means that even if a hundred servers crashed at the same time, the service would still be available and in working order.

The rationale behind using a large number of bog-standard PCs rather than a few state-of-the-art servers is simple: cost per performance. It is possible to buy servers with 8 CPUs, 64 GB of memory, and 8 TB of disk space, but these cost roughly three times the price of a rack of 88 dual-processor machines with 2 GB of memory and 80 GB of disk space each. The high-end server would serve a single client four times faster than the rack of slower computers, but the rack could serve 22 times as many concurrent users as the high-end server. That's scalability.

This does not mean, however, that one server handles one user's request. If that were the case, each computer would have to trawl through thousands of terabytes of data looking for a search term, and it would take weeks to return a single query. Instead, the servers are divided into six different groups: Web servers, document servers, index servers, spell-check servers, advertisement servers, and Googlebot servers, each performing its own task.

Google uses a sophisticated DNS system to select the most appropriate Web server for its visitors. This DNS system can automatically redirect visitors to the geographically closest data center. This is why, for instance, if you type www.google.com in Switzerland, you will be directed to www.google.ch, which is located in Zurich, but if you type www.google.com in California, you will be directed to their data center in Santa Clara. The DNS system also accounts for server load and may redirect to different centers in the event of high congestion. When the request arrives at the data center, it goes through a hardware load balancer that selects one from a cluster of available Web servers to handle the request.

These Web servers' sole function is to prepare and serve the HTML to the client; they do not perform the actual search. The search task is delegated to a cluster of index servers, which lie behind the Web servers. An index server cluster comprises hundreds of computers, each holding a subset (or shard) of a multiterabyte database. Many computers may hold identical subsets of the same database in case of a hardware failure on one of the index servers. The index itself is a list of correlated words and terms with a list of document IDs and a relevancy rating for each match. A document ID is a reference to a Web page or other Google-readable media (e.g., PDF, DOC). The order of results returned by the index depends on the combined relevancy rating of the search terms and the page rank of the document ID. The page rank is a gauge of site popularity measured as a sum of the popularity of the sites linking to it. Other factors also affect page rank, such as the number of links leaving the site, the structure of internal links, and so forth.

Google's document servers contain cached copies of virtually the entire World Wide Web on their hard drives. Each data center has its own document server cluster, and each document server cluster needs to hold at least two copies of the Web, in order to provide redundancy in case of server failure. But document servers are not merely data warehouses. They also retrieve the page title and keyword-in-context snippet from the document ID provided by the index servers. While the search is running, the peripheral systems add their content to the page; this includes the spell check and the advertisements. Once all elements of the page are together, the page is shipped off to the visitor, all in less than a second.

Google also employs another breed of software: a spider named Googlebot. This piece of software, running on thousands of PCs simultaneously, trawls the Web continuously, completing a full round-trip in approximately one month. Googlebot requests pages in an ordered fashion, following links to a set depth, storing the content in the document servers and updating the index servers with updated document IDs, relevancy ratings, and page rank values. Another spider named Fastbot crawls the Web on a more regular basis, sometimes in less than a week. It only visits sites with a high page rank and those that are frequently updated.

The Google architecture is one of the best in the world and is the pinnacle of scalability; however, for .NET developers, there is a slight twist in the tale. Google can afford to buy 15,000 servers by cutting down on licensing costs. This means that they use Linux, not Windows. Unfortunately, Linux isn't exactly home turf for .NET, but there is an open-source project called Mono, which aims to provide a C# compiler for Linux (see www.go-mono.com).

10.3 Replication and redundancy

Keeping a backup system ready for instant deployment is redundancy; keeping the backup system identical to the live system is replication. When dealing with a high-availability Internet-based service, it is important to keep more than one copy of critical systems. Thus, in the event of software or hardware failure, an identical copy of the software can take the place of the failed module.

Backup systems do not need to be kept on separate machines. You can provide redundant storage with a redundant array of inexpensive disks (RAID), where the file system is stored across several physical hard disks. If one disk fails, the other disks take over, with no loss of data. Many computers can read from a RAID array at once, but only one computer can write to it at a time (an arrangement known as "shared nothing").

Of course, it's not just hard disks that fail. If a computer fails, another must take over in the same way. Providing redundancy among computers is the task of a load balancer, a piece of hardware or software that delegates client requests among multiple servers. In order to provide redundancy, the load balancer must be able to recognize a crashed computer or one that is unable to respond in a timely fashion. A full discussion of load balancers is included later in this chapter.

Replication provides the means by which a backup system can remain identical to the live system. If replication did not occur, data on the backup system could become so out-of-date that it would be worthless if set live. Replication is built into Microsoft SQL Server, accessible under the Replication folder in Enterprise Manager. SQL replication works by relaying update, insert, and delete statements from one server to another. Changes made while the other server is down are queued until the server goes live again.

10.4 Scalable network applications

Server-side applications are often required to operate with full efficiency under extreme load. Efficiency, in this sense, relates to both the throughput of the server and the number of clients it can handle. In some cases, it is common to deny new clients in order to conserve resources for existing clients.

The key to providing scalable network applications is to keep threading as efficient as possible. In many examples in this book, a new thread is created for each new client that connects to the server. This approach, although simple, is not ideal. The underlying management of a single thread consumes far more memory and processor time than a socket. In benchmarking tests, a simple echo server, running on a Pentium 4 1.7 GHz with 768 MB of memory, was connected to three clients: a Pentium II 233 MHz with 128 MB of memory, a Pentium II 350 MHz with 128 MB of memory, and an Itanium 733 MHz with 1 GB of memory. This semitypical arrangement demonstrated that, using the thread-per-client approach outlined above, the server could only serve 1,008 connections before it reached an internal thread-creation limit. The maximum throughput was 2 Mbps. When a further 12,000 connections were attempted and rejected, the throughput fell to a mere 404 Kbps. The server, although it had adequate memory and CPU resources to handle the additional clients, was unable to do so because thread creations and destructions were consuming all of the CPU resources.

To better manage thread creation, a technique known as thread pooling (demonstrated later in this chapter) can be employed. When thread pooling was applied to the echo server example, the server performed somewhat better. With 12,000 client connections, the server handled each one without fail. The throughput was 1.8 Mbps, vastly outperforming the software in the previous example, which obtained only 0.4 Mbps at the same CPU load. As a further 49,000 clients connected, however, the server began to drop 0.6% of the connections. At the same time, the CPU usage reached 95% of its peak capacity. At this load, the combined throughput was 3.8 Mbps.

Thread pooling unarguably provides a scalability bonus, but it is not acceptable to consume 95% of server resources just doing socket I/O, especially when other applications must also use the computer. To beef up the server further, the threading model should be abandoned completely in favor of I/O completion ports (see Chapter 3). This methodology uses asynchronous callbacks that are managed at the operating system level. By modifying the above example to use I/O completion ports rather than thread pools, the server once again handled 12,000 clients without fail; however, this time the throughput was an impressive 5 Mbps. When the load was pushed to 50,000 clients, the server handled these connections virtually flawlessly and maintained a healthy throughput of 4.3 Mbps. The CPU usage at this load was 65%, which could have permitted other applications to run on the same server without conflicts. In both the thread-pool and completion-port models, the memory usage at 50,000 connections was more than 240 MB, including non-paged-pool usage of more than 145 MB. If the server had less than this available in physical memory, the results would have been substantially worse.
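The completion-port approach is covered fully in Chapter 3, but as a brief, self-contained sketch of the asynchronous pattern that .NET layers over completion ports, a socket can be read with a BeginReceive/EndReceive callback instead of a dedicated thread per client. The loopback port 8901 and the message text below are arbitrary choices for this demonstration, not values from the benchmark above.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

class AsyncReceiveDemo
{
    static ManualResetEvent done = new ManualResetEvent(false);
    static byte[] buffer = new byte[1024];
    static Socket client;

    // Callback invoked by the runtime (via a completion port on Windows)
    // when data arrives; no thread blocks while waiting for the data.
    static void OnReceive(IAsyncResult ar)
    {
        int bytes = client.EndReceive(ar);
        Console.WriteLine("Received: " + Encoding.ASCII.GetString(buffer, 0, bytes));
        done.Set();
    }

    static void Main()
    {
        // Loopback listener standing in for a real server; port is arbitrary.
        TcpListener listener = new TcpListener(IPAddress.Loopback, 8901);
        listener.Start();

        client = new Socket(AddressFamily.InterNetwork,
            SocketType.Stream, ProtocolType.Tcp);
        client.Connect(new IPEndPoint(IPAddress.Loopback, 8901));

        Socket server = listener.AcceptSocket();
        server.Send(Encoding.ASCII.GetBytes("hello"));

        // Post the asynchronous receive; OnReceive fires when data is ready.
        client.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None,
            new AsyncCallback(OnReceive), null);

        done.WaitOne();
        client.Close();
        server.Close();
        listener.Stop();
    }
}
```

The important point is that between BeginReceive and the callback, no application thread is tied up waiting on the socket, which is what lets a single server scale past the thread-per-client limits described above.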

10.5 Future proofing

Scalability can also refer to the ability of an application to evolve gracefully to meet future demands without major overhaul. When software is first designed, the primary goal is to hit all of the customer's requirements or to meet the perceived needs of a typical end-user. After rollout of the product, it may address these requirements perfectly, but once the market demands some major change to the application, the program has to scale to meet the new demands without massive recoding. This connotation of scalability is not the focus of this chapter, but some of the following tips may help create a future-proof application:

- Use classes instead of basic types for variables that represent elements within your software that may grow in complexity. This ensures that functions that accept these variables as parameters will not need to change as dramatically in the future.
- Keep culture-specific strings in a resource file; if the software is ever localized for a different language, this will reduce the change impact.
- Keep abreast of modern technologies. It may soon be a requirement of network applications to be IPv6 compliant.
- Provide a means to update your software automatically after deployment.

The key to architectural scalability is to make everything configurable and to assume nothing of the deployment environment.
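As a minimal sketch of that configurability principle, a server address can be read from the application's .config file instead of being hard-coded. The key name serverUrl and the fallback value here are illustrative, not taken from this chapter.

```csharp
using System;
using System.Configuration;

class ConfigDemo
{
    static void Main()
    {
        // Look up the server address in the <appSettings> section of the
        // application's .config file; the key "serverUrl" is a hypothetical
        // name chosen for this example.
        string serverUrl = ConfigurationSettings.AppSettings["serverUrl"];

        // Assume nothing about the deployment environment: fall back to a
        // default when no configuration entry is present.
        if (serverUrl == null)
        {
            serverUrl = "http://localhost";
        }
        Console.WriteLine("Connecting to: " + serverUrl);
    }
}
```

Moving such values out of the code means a redeployed or renumbered server requires only a configuration change, not a recompile.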

10.6 Thread pooling

Every computer has a limit to the number of threads it can process at one time. Depending on the resources consumed by each thread, this number could be quite low. When given the choice either to guarantee that your software can handle a set number of clients or to "max out" the computer's resources and risk a system crash, choose the first option: thread pooling.

Threads can improve the responsiveness of applications when each thread consumes less than 100% of the processor time. Multitasking operating systems share the available CPU resources among the running threads, switching quickly between them to give the impression that they are all running in parallel. This switching, which may occur up to 60 times per second, incurs a small cost on each switch, which can become prohibitive if the number of threads grows too large. Threads that are blocked waiting for some event do not consume CPU resources while they wait, but they still consume some kernel memory resources. The optimum number of threads for any given application is system dependent, and a thread pool is useful for finding this optimum number. To give some perspective on the effect of unpooled threading, examine the code below:

C#

public void IncrementThread()
{
    while (true)
    {
        myIncrementor++;
        long ticks = DateTime.Now.Ticks - startTime.Ticks;
        lock (this)
        {
            lblIPS.Text = "Increments per second: " +
                (myIncrementor / ticks) * 10000000;
        }
    }
}

VB.NET

Public Sub IncrementThread()
    Dim ticks As Long
    Do
        myIncrementor = myIncrementor + 1
        ticks = DateTime.Now.Ticks - startTime.Ticks
        SyncLock Me
            lblIPS.Text = "Increments per second: " & _
                (myIncrementor / ticks) * 10000000
        End SyncLock
    Loop
End Sub

This code adds one to a public variable named myIncrementor. It then takes an accurate reading of the system time before updating the screen to show the number of increments per second. The SyncLock or lock statement ensures that no two threads attempt to update the screen at the same time, because this causes unpredictable results. The results shown on-screen should not be used as a measure of how quickly the computer can perform subtraction, because most of the processor time is actually spent displaying the results! When this thread was instantiated on its own, it operated at a speed of 255 increments per second; however, when this thread was instantiated 1,000 times and run concurrently, the threads consumed more than 60 MB of memory in stack frames, which on some older computers would go directly to a paging file on disk, creating a systemwide loss of performance. In a group of 1,000 threads, the overall performance was a mere 98 increments per second, meaning that a single thread could take more than 10 seconds to iterate through one while loop. The test machine was a 555 MHz Pentium III with 128 MB of RAM. With a thread pool, the optimal number of threads on this particular computer was found to be 25, which gave an overall operating speed of 402 increments per second, with a slightly modified IncrementThread() routine.

10.6.1 Implementing a thread pool

Thread pools are used constantly in servers, where a reliable service must be provided regardless of load. This sample application is simply a benchmarking utility, but with experimentation it could be adapted for other purposes. Create a new project in Visual Studio .NET and drop in two labels: lblThreads and lblIPS. The thread pool will be populated with threads as soon as the form loads. The exact time at which the form starts is stored in a public variable named startTime. Every thread then adds one to a public variable named myIncrementor, which helps gauge overall performance. Both of these are included in the code directly after the class declaration:

C#

public class Form1 : System.Windows.Forms.Form
{
    public double myIncrementor;
    public DateTime startTime;

VB.NET

Public Class Form1
    Inherits System.Windows.Forms.Form
    Public myIncrementor As Double
    Public startTime As DateTime


To populate the thread pool, a check is made to see how many threads should run together concurrently. That number of threads is then added to the thread pool. There is no problem in adding more than the recommended number of threads to the pool because the surplus threads will not execute until another thread has finished. In this case, the threads run in an infinite loop; therefore, no surplus threads would ever execute. Double-click on the form and add the following code:

C#

private void Form1_Load(object sender, System.EventArgs e)
{
    int workerThreads = 0;
    int IOThreads = 0;
    ThreadPool.GetMaxThreads(out workerThreads, out IOThreads);
    lblThreads.Text = "Threads: " + workerThreads;
    for (int threads = 0; threads < workerThreads; threads++)
    {
        ThreadPool.QueueUserWorkItem(new WaitCallback(Increment), this);
    }
    startTime = DateTime.Now;
}

VB.NET

Private Sub Form1_Load(ByVal sender As Object, _
        ByVal e As System.EventArgs) Handles MyBase.Load
    Dim workerThreads As Integer = 0
    Dim IOThreads As Integer = 0
    ThreadPool.GetMaxThreads(workerThreads, IOThreads)
    lblThreads.Text = "Threads: " & workerThreads
    Dim threads As Integer
    For threads = 1 To workerThreads
        ThreadPool.QueueUserWorkItem(New WaitCallback( _
            AddressOf Increment), Me)
    Next
    startTime = DateTime.Now
End Sub


This code first obtains the default number of threads that can run concurrently on the local machine using the GetMaxThreads method. It then displays this value on-screen before creating and running the threads. There can only be one thread pool per application, so only static methods are called on the thread pool. The most important method is QueueUserWorkItem. The first parameter of this method is the function (delegate) to be called, and the second parameter (which is optional) is the object that is to be passed to the new thread. The Increment function is then implemented thus:

C#

public void Increment(object state)
{
    // The state parameter receives the object queued with the work item.
    while (true)
    {
        myIncrementor++;
        long ticks = DateTime.Now.Ticks - startTime.Ticks;
        lock (this)
        {
            lblIPS.Text = "Increments per second: " +
                (myIncrementor / ticks) * 10000000;
        }
    }
}

VB.NET

Public Sub Increment(ByVal state As Object)
    Dim ticks As Long
    Do
        myIncrementor = myIncrementor + 1
        ticks = DateTime.Now.Ticks - startTime.Ticks
        SyncLock Me
            lblIPS.Text = "Increments per second: " & _
                (myIncrementor / ticks) * 10000000
        End SyncLock
    Loop
End Sub

Figure 10.1: Thread pool sample application.

The lock (or SyncLock) is required for application stability. If two threads repeatedly access the same user-interface element at the same time, the application's UI becomes unresponsive. Finally, the threading namespace is required:

C#

using System.Threading;

VB.NET

Imports System.Threading

To test the application, run it from Visual Studio .NET and wait a minute or two for the increments-per-second value to settle on a number (Figure 10.1). You can experiment with this application and see how performance increases and decreases under certain conditions, such as running several applications or running with low memory.

10.7 Avoiding deadlocks

Deadlocks are the computing equivalent of a Catch-22 situation. Imagine an application that retrieves data from a Web site and stores it in a database. Users can use this application to query either the database or the Web site. These three tasks would be implemented as separate threads, and, for whatever reason, no two threads can access the Web site or the database at the same time.

The first thread would be:

- Wait for access to the Web site.
- Restrict other threads' access to the Web site.
- Wait for access to the database.
- Restrict other threads' access to the database.
- Draw down the data, and write it to the database.
- Relinquish the restriction on the database and Web site.

The second thread would be:

- Wait for access to the database.
- Restrict other threads' access to the database.
- Read from the database.
- Execute thread three, and wait for its completion.
- Relinquish the restriction on the database.

The third thread would be:

- Wait for access to the Web site.
- Restrict other threads' access to the Web site.
- Read from the Web site.
- Relinquish the restriction on the Web site.

Any thread running on its own will complete without any errors; however, if thread 2 is at the point of reading from the database, while thread 1 is waiting for access to the database, the threads will hang. Thread 3 will never complete because thread 1 will never get access to the database until thread 2 is satisfied that thread 3 is complete. A deadlock could have been avoided by relinquishing the database restriction before executing thread 3, or in several different ways, but the problem with deadlocks is spotting them and redesigning the threading structure to avoid the bug.
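One general remedy is to make every thread that needs multiple resources acquire them in the same global order, which makes the circular wait described above impossible. The sketch below uses hypothetical lock objects standing in for the Web site and database restrictions; the class and method names are illustrative, not from this chapter.

```csharp
using System;
using System.Threading;

class LockOrderingDemo
{
    // Hypothetical stand-ins for the Web site and database restrictions.
    static readonly object webSiteLock = new object();
    static readonly object databaseLock = new object();

    // This thread needs both resources, so it takes them in the agreed
    // global order: webSiteLock first, then databaseLock.
    static void DownloadAndStore()
    {
        lock (webSiteLock)
        {
            lock (databaseLock)
            {
                Console.WriteLine("downloaded and stored");
            }
        }
    }

    // This thread needs only the database, so it takes only databaseLock.
    // Because no thread ever holds databaseLock while waiting for
    // webSiteLock, no circular wait can form.
    static void QueryDatabase()
    {
        lock (databaseLock)
        {
            Console.WriteLine("queried database");
        }
    }

    static void Main()
    {
        Thread t1 = new Thread(new ThreadStart(DownloadAndStore));
        Thread t2 = new Thread(new ThreadStart(QueryDatabase));
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        Console.WriteLine("both threads completed");
    }
}
```

The same discipline applied to the scenario above (always Web site before database, and never holding one lock while executing another thread that takes the other) would have prevented the hang.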

10.8 Load balancing

Load balancing is a means of dividing workload among multiple servers by forwarding only a percentage of requests to each server. The simplest way of doing this is DNS round-robin, where a DNS server holds multiple address entries for the same domain name, so that when a client resolves the name, it receives one of a number of IP addresses to connect to. This approach has one major drawback: with two servers, if one of them crashes, 50% of your clients will receive no data. The same effect can be achieved on the client side, where the application connects to an alternative IP address if one server fails to return data. Of course, this would be a nightmare scenario if you deployed a thousand kiosks, only to find a week later that your service provider had gone bust and you were issued new IP addresses. If you work by DNS names, you will have to wait 24 hours for the propagation to take place.

Computers can change their IP addresses by themselves, simply by returning a different response when they receive an ARP request. There is no programmatic control over the ARP table in Windows computers, but you can use specially designed load-balancing software, such as Microsoft Network Load Balancing Service (NLBS), which ships with Windows 2000 Advanced Server. This allows many computers to operate from the same IP address. By checking the status of services such as IIS on each computer in a cluster, the other computers can elect to exclude that computer from the cluster until it fixes itself or a technician does so. The computers do not actually use the same IP address; in truth, the IP addresses are interchanged to create the same effect. NLBS is suitable for small clusters of four or five servers, but for high-end server farms of between 10 and 8,000 computers, the ideal solution is a hardware virtual server, such as Cisco's LocalDirector. This machine sits between the router and the server farm, and all requests to it are fed directly to one of the computers sitting behind it, provided that that server is listening on port 80.

None of the above solutions (DNS round-robin, Cisco LocalDirector, or Microsoft NLBS) can provide the flexibility of custom load balancing. NLBS, for instance, routes requests only on the basis of a fixed percentage of client requests assigned to each server. So if you have multiple servers with different hardware configurations, it's your responsibility to estimate each system's performance relative to the others. If you wanted to route requests based on actual server CPU usage, you couldn't achieve this with NLBS alone.

There are two ways of providing custom load balancing: through hardware or software. A hardware solution can be achieved with a little imagination and a router. Most routers are configurable via a Web interface or serial connection, so a computer can configure its own router either through an RS232 connection (briefly described in Chapter 4) or by using HTTP. Each computer can periodically connect to the router and set up port forwarding so that incoming requests come to it rather than the other machine. The hardware characteristics of the router may determine how quickly port forwarding can be switched between computers and how requests are handled during settings changes. This method may require some experimentation, but it could be a cheap solution to load balancing, or at least to graceful failover.

Custom software load balancers are applicable in systems where the time to process each client request is substantially greater than the time to move the data across the network. For these systems, it is worth considering using a second server to share the processing load. You could program the clients to switch intermittently between servers, but this may not always be possible if the client software is already deployed. A software load balancer inevitably incurs an overhead, which in some cases could be more than the time saved by relieving server load, so this solution may not be ideal in all situations.

This implementation of a software load balancer behaves a little like a proxy server. It accepts requests from the Internet and relays them to a server of its choosing. The relayed requests must have their HOST header changed to reflect the new target; otherwise, the server may reject the request. The load balancer can relay requests based on any criteria, such as server CPU load, memory usage, or any other factor. It could also be used to control failover: if one server fails, the load balancer could automatically redirect traffic to the remaining operational servers. In this case, a simple round-robin approach is used.

The example program balances load among three mirrored HTTP servers: uk.php.net, ca.php.net, and ca2.php.net. Requests from users are directed initially to the load-balancing server and are then channeled to one of these servers, with the response returned to the user. Note that this approach does not take advantage of any geographic proximity the user may have to the Web servers, because all traffic is channeled through the load balancer.

To create this application, start a new project in Microsoft Visual Studio .NET. Draw a textbox on the form, named tbStatus. It should have Multiline set to true. Add two public variables at the top of the Form class as shown. The port variable holds the TCP port on which the load balancer will listen. The site variable holds a number indicating the next available Web server.

C#

public class Form1 : System.Windows.Forms.Form
{
    public int port;
    public int site;

VB.NET

Public Class Form1
    Inherits System.Windows.Forms.Form
    Public port As Integer
    Public site As Integer

When the application starts, it immediately runs a thread that waits indefinitely for external TCP connections. This code is placed into the form's Load event:

C#

private void Form1_Load(object sender, System.EventArgs e)
{
    Thread thread = new Thread(new ThreadStart(ListenerThread));
    thread.Start();
}

VB.NET

Private Sub Form1_Load(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles MyBase.Load
    Dim thread As Thread = New Thread(New ThreadStart( _
        AddressOf ListenerThread))
    thread.Start()
End Sub

The ListenerThread works by listening on port 8889 and waiting for connections. When it receives a connection, it instantiates a new instance of the WebProxy class and starts its run method in a new thread. It sets the class's clientSocket and UserInterface properties so that the WebProxy instance can reference the form and the socket containing the client request.

C#

public void ListenerThread()
{
    port = 8889;
    TcpListener tcplistener = new TcpListener(port);
    reportMessage("Listening on port " + port);
    tcplistener.Start();
    while (true)
    {
        WebProxy webproxy = new WebProxy();
        webproxy.UserInterface = this;
        webproxy.clientSocket = tcplistener.AcceptSocket();
        reportMessage("New client");
        Thread thread = new Thread(new ThreadStart(webproxy.run));
        thread.Start();
    }
}

VB.NET

Public Sub ListenerThread()
    port = 8889
    Dim tcplistener As TcpListener = New TcpListener(port)
    reportMessage("Listening on port " + port.ToString())
    tcplistener.Start()
    Do
        Dim webproxy As WebProxy = New WebProxy
        webproxy.UserInterface = Me
        webproxy.clientSocket = tcplistener.AcceptSocket()
        reportMessage("New client")
        Dim thread As Thread = New Thread(New ThreadStart( _
            AddressOf webproxy.run))
        thread.Start()
    Loop
End Sub

A utility function that is used throughout the application is reportMessage. Its function is to display messages in the textbox and scroll the textbox automatically, so that the user can see the newest messages as they arrive.

C#

public void reportMessage(string msg)
{
    lock (this)
    {
        tbStatus.Text += msg + "\r\n";
        tbStatus.SelectionStart = tbStatus.Text.Length;
        tbStatus.ScrollToCaret();
    }
}

VB.NET

Public Sub reportMessage(ByVal msg As String)
    SyncLock Me
        tbStatus.Text += msg + vbCrLf
        tbStatus.SelectionStart = tbStatus.Text.Length
        tbStatus.ScrollToCaret()
    End SyncLock
End Sub

The core algorithm of the load balancer is held in the getMirror function. This method simply returns a URL based on the site variable. More complex load-balancing techniques could be implemented within this function if required.

public string getMirror()

{ string Mirror = ""; switch(site)

{ case 0: Mirror="uk.php.net"; site++; break; case i: Mirror="ca.php.net"; site++; break; case 2: Mirror="ca2.php.net"; site=0; break;

} return Mirror;

} I Chapter I0

268

10.8 Load balancing

VB.NET

Public Function getMirror() As String
    Dim Mirror As String = ""
    Select Case site
        Case 0
            Mirror = "uk.php.net"
            site = site + 1
        Case 1
            Mirror = "ca.php.net"
            site = site + 1
        Case 2
            Mirror = "ca2.php.net"
            site = 0
    End Select
    Return Mirror
End Function
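As noted, more complex policies could live inside getMirror. One common refinement is weighted round-robin, where faster mirrors receive proportionally more requests. The sketch below uses assumed weights (the weight values are illustrative, not measured) and the same three mirrors:

```csharp
using System;

class WeightedMirrorPicker
{
    // A mirror with weight w is returned w times per full cycle,
    // giving a weighted round-robin schedule.
    string[] mirrors = { "uk.php.net", "ca.php.net", "ca2.php.net" };
    int[] weights = { 3, 2, 1 }; // illustrative: uk gets half the traffic
    int site = 0;      // index of the current mirror
    int remaining = 0; // requests left on the current mirror

    public string GetMirror()
    {
        if (remaining == 0)
        {
            site = (site + 1) % mirrors.Length;
            remaining = weights[site];
        }
        remaining--;
        return mirrors[site];
    }

    static void Main()
    {
        WeightedMirrorPicker p = new WeightedMirrorPicker();
        int uk = 0;
        for (int i = 0; i < 60; i++)
            if (p.GetMirror() == "uk.php.net") uk++;
        // weight 3 of a total 6 => half of all requests
        Console.WriteLine("uk.php.net served " + uk + " of 60 requests");
    }
}
```

Other policies, such as least-connections or latency-based selection, would slot into the same GetMirror signature.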

The next step is to develop the WebProxy class. This class contains two public variables and two functions. Create the class thus:

C#

public class WebProxy
{
    public Socket clientSocket;
    public Form1 UserInterface;
}

VB.NET

Public Class WebProxy
    Public clientSocket As Socket
    Public UserInterface As Form1
End Class

The entry point to the class is the run method. This method reads 1,024 (or fewer) bytes from the HTTP request. It is assumed that the HTTP request is less than 1 KB in size, in ASCII format, and that it can be received in one Receive operation. The next step is to remove the Host HTTP header and replace it with a Host header pointing to the server returned by getMirror. Having done this, it passes control to relayTCP to complete the task of transferring data from user to Web server.


C#

public void run()
{
    string sURL = UserInterface.getMirror();
    byte[] readIn = new byte[1024];
    int bytes = clientSocket.Receive(readIn);
    // Check for an empty request before attempting to parse the header
    if (bytes == 0) return;
    string clientmessage = Encoding.ASCII.GetString(readIn);
    clientmessage = clientmessage.Substring(0, bytes);
    int posHost = clientmessage.IndexOf("Host:");
    int posEndOfLine = clientmessage.IndexOf("\r\n", posHost);
    clientmessage = clientmessage.Remove(posHost, posEndOfLine - posHost);
    clientmessage = clientmessage.Insert(posHost, "Host: " + sURL);
    readIn = Encoding.ASCII.GetBytes(clientmessage);
    UserInterface.reportMessage("Connection from: " +
        clientSocket.RemoteEndPoint + "\r\n");
    UserInterface.reportMessage("Connecting to Site: " + sURL + "\r\n");
    relayTCP(sURL, 80, clientmessage);
    clientSocket.Close();
}

VB.NET

Public Sub run()
    Dim sURL As String = UserInterface.getMirror()
    Dim readIn(1023) As Byte
    Dim bytes As Integer = clientSocket.Receive(readIn)
    ' Check for an empty request before attempting to parse the header
    If bytes = 0 Then Return
    Dim clientmessage As String = _
        Encoding.ASCII.GetString(readIn)
    clientmessage = clientmessage.Substring(0, bytes)
    Dim posHost As Integer = clientmessage.IndexOf("Host:")
    Dim posEndOfLine As Integer = clientmessage.IndexOf( _
        vbCrLf, posHost)
    clientmessage = clientmessage.Remove(posHost, _
        posEndOfLine - posHost)
    clientmessage = clientmessage.Insert(posHost, _
        "Host: " + sURL)
    readIn = Encoding.ASCII.GetBytes(clientmessage)
    UserInterface.reportMessage("Connection from: " + _
        clientSocket.RemoteEndPoint.ToString())
    UserInterface.reportMessage("Connecting to Site: " + sURL)
    relayTCP(sURL, 80, clientmessage)
    clientSocket.Close()
End Sub
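The run method assumes the entire request header arrives in a single Receive of at most 1,024 bytes. TCP offers no such guarantee: a header can be split across segments or exceed 1 KB. A more robust proxy would accumulate received chunks until the blank line (CRLF CRLF) that terminates an HTTP request header. The sketch below shows that accumulation logic in isolation, fed from byte arrays rather than a live socket; the class name is illustrative:

```csharp
using System;
using System.Text;

class HeaderReader
{
    // Accumulates chunks (as successive Receive calls would deliver
    // them) until the CRLFCRLF that ends an HTTP request header.
    StringBuilder buffer = new StringBuilder();

    // Returns the complete header once seen, or null if more data
    // is still needed.
    public string Feed(byte[] chunk, int count)
    {
        buffer.Append(Encoding.ASCII.GetString(chunk, 0, count));
        int end = buffer.ToString().IndexOf("\r\n\r\n");
        return end >= 0 ? buffer.ToString().Substring(0, end + 4) : null;
    }

    static void Main()
    {
        HeaderReader r = new HeaderReader();
        // The request arrives split across two Receive calls.
        byte[] part1 = Encoding.ASCII.GetBytes("GET / HTTP/1.0\r\nHost: uk");
        byte[] part2 = Encoding.ASCII.GetBytes(".php.net\r\n\r\n");
        Console.WriteLine(r.Feed(part1, part1.Length) == null);
        string header = r.Feed(part2, part2.Length);
        Console.WriteLine(header.IndexOf("Host: uk.php.net") >= 0);
    }
}
```

In the proxy, run would call Feed after each Receive and only rewrite the Host header once Feed returns a complete header.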

The data transfer takes place in relayTCP. It opens a TCP connection to the Web server on port 80 and then sends it the modified HTTP header received from the user. Immediately after the data is sent, it goes into a loop, reading 255-byte (Byte.MaxValue) chunks of data from the Web server and sending them back to the client. If at any point it encounters an error, or the data flow comes to an end, the loop is broken and the function returns.

C#

public void relayTCP(string host, int port, string cmd)
{
    byte[] szData;
    byte[] RecvBytes = new byte[Byte.MaxValue];
    Int32 bytes;
    TcpClient TcpClientSocket = new TcpClient(host, port);
    NetworkStream NetStrm = TcpClientSocket.GetStream();
    szData = System.Text.Encoding.ASCII.GetBytes(cmd.ToCharArray());
    NetStrm.Write(szData, 0, szData.Length);
    while(true)
    {
        try
        {
            bytes = NetStrm.Read(RecvBytes, 0, RecvBytes.Length);
            clientSocket.Send(RecvBytes, bytes, SocketFlags.None);
            if (bytes <= 0) break;
        }
        catch
        {
            UserInterface.reportMessage("Failed connect");
            break;
        }
    }
}

VB.NET

Public Sub relayTCP(ByVal host As String, ByVal port _
    As Integer, ByVal cmd As String)
    Dim szData() As Byte
    Dim RecvBytes(Byte.MaxValue) As Byte
    Dim bytes As Int32
    Dim TcpClientSocket As TcpClient = New TcpClient(host, port)
    Dim NetStrm As NetworkStream = TcpClientSocket.GetStream()
    szData = _
        System.Text.Encoding.ASCII.GetBytes(cmd.ToCharArray())
    NetStrm.Write(szData, 0, szData.Length)
    While True
        Try
            bytes = NetStrm.Read(RecvBytes, 0, RecvBytes.Length)
            clientSocket.Send(RecvBytes, bytes, SocketFlags.None)
            If bytes <= 0 Then Exit While
        Catch
            UserInterface.reportMessage("Failed connect")
            Exit While
        End Try
    End While
End Sub
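The relay loop above is a one-way pump: read from the server, forward to the client, stop on zero bytes or an error. The same pattern can be expressed against abstract streams, which makes the loop testable without sockets. In this sketch the 4 KB buffer is an assumed figure, larger than the Byte.MaxValue (255-byte) buffer used above:

```csharp
using System;
using System.IO;

class StreamPump
{
    // Copies src to dst until end-of-stream: the same loop shape
    // relayTCP uses between the NetworkStream and the client socket.
    public static long Pump(Stream src, Stream dst)
    {
        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        while ((n = src.Read(buf, 0, buf.Length)) > 0)
        {
            dst.Write(buf, 0, n);
            total += n;
        }
        return total;
    }

    static void Main()
    {
        byte[] payload = new byte[10000]; // larger than one buffer
        new Random(1).NextBytes(payload);
        MemoryStream src = new MemoryStream(payload);
        MemoryStream dst = new MemoryStream();
        long copied = Pump(src, dst);
        Console.WriteLine(copied + " bytes relayed");
    }
}
```

A larger buffer reduces the number of Read/Send round trips per page, which matters once many clients are being relayed concurrently.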

As usual, some standard namespaces are added to the head of the code:

C#

using System.Net;
using System.Net.Sockets;
using System.Text;
using System.IO;
using System.Threading;

VB.NET

Imports System.Net
Imports System.Net.Sockets
Imports System.Text
Imports System.IO
Imports System.Threading

To test the application, run it from Visual Studio .NET, and then open a browser on http://localhost:8889; you will see that the Web site is loaded from all three servers. In this case, data transfer consumes most of the site's loading time, so there would be little performance gain, but it should serve as an example (Figure 10.2).

Figure 10.2 HTTP load-balancing application.

10.9 Conclusion

Scalability problems generally only start appearing once a product has rolled out into full-scale production. At this stage in the life cycle, making modifications to the software becomes a logistical nightmare. Any changes to the software will necessarily have to be backward compatible with older versions of the product.

Many software packages now include an autoupdater, which accommodates postdeployment updates; however, the best solution is to address scalability issues at the design phase, rather than ending up with a dozen versions of your product and the server downtime caused by implementing updates.

The next chapter deals with network performance, including techniques such as compression and multicast.
