Understanding ZIMBRA Server Architecture Can Help to Size Up Your ZIMBRA Deployment Strategy
Author - Sing Koo
Zimbra Collaboration Suite (ZCS) is a powerful, Web-based, all-in-one open source system that can easily meet the needs of many growing businesses. It is not just e-mail. It is a workgroup solution that enables the sharing of resources over the Web using the HTTP or HTTPS protocol. Resources such as e-mail, contact management, calendaring, task management, word processing, and file sharing are built into the system. Because ZCS is open source, anyone can download it without paying licensing or maintenance fees. The only things you will need are a server, a static IP address and Internet bandwidth. Questions about scalability are among those most often asked by administrators and businesses considering or planning a ZCS deployment. How many users can Zimbra host? How many concurrent users can it handle? How does this factor into performance? I have seen these questions asked time and again. As a result, I decided to dig into the architecture of the Zimbra Server in the hope of offering readers more insight and clarity when estimating ZCS sizing and capacity.
Thanks to Zimbra's object framework, scaling is not limited to a single server; a multi-server architecture can be used to support a large number of users over a wide geographical area. The Zimbra Server runs on top of a servlet engine in a Java Virtual Machine (JVM) environment. For the benefit of readers who are not Java-savvy, a JVM is an operating system process running a Java runtime binary distributed by a Java provider, typically Sun (Oracle), IBM, or another JVM maker. All JVMs are built to the same specification. Implementations vary in areas such as thread management, exception management, and memory management, but these variations are transparent to end users. The JVM comes with a loader that loads the initial class file (a Java program), which contains a static "main" method. This method then instantiates objects to perform the work. A servlet engine is a Java program that manages objects in the form of servlets. A servlet is a Java program that conforms to the servlet specification set forth by the Java community. Well-known servlet engines include Tomcat, JBoss, GlassFish, WebSphere, and Jetty. What makes one servlet engine different from the next is the additional environment that it supports. For example, JBoss supports the Java Enterprise Edition (Java EE, formerly J2EE) environment. The J2EE environment offers common services that become an integral part of a typical Java application. Common examples of these services are messaging services, persistent data services, and model view controller services.
The Zimbra Server does not use any of the J2EE services. It comes with a servlet interface that enables it to run as a servlet instance in a JVM. Although it could run in other servlet engines, Jetty has been selected as part of its distribution. Jetty is lightweight and comes without the J2EE services. It offers thread pools so that servlet instances can handle multiple user requests. A thread pool is a set of reusable threads kept on standby. Using one improves performance because it avoids the overhead of creating and destroying a thread for each request. Zimbra uses thread pools to handle all requests from the Web Client.
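The effect of thread pooling can be seen in a minimal Java sketch. It uses the standard java.util.concurrent executor rather than Jetty's own pool, and the class name ThreadPoolSketch, the method serve, and the pool size of 4 are assumptions made for illustration only; they are not taken from Zimbra or Jetty.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadPoolSketch {

    // Serves requestCount simulated requests on a fixed pool of poolSize
    // threads and returns the names of the worker threads that did the work.
    // (Hypothetical helper; a real servlet engine manages its pool internally.)
    static Set<String> serve(int poolSize, int requestCount) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> workers = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < requestCount; i++) {
            // Each "request" only records which pooled thread handled it.
            pool.execute(() -> workers.add(Thread.currentThread().getName()));
        }
        pool.shutdown(); // accept no new work; queued requests still run
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("requests did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return workers;
    }

    public static void main(String[] args) {
        Set<String> workers = serve(4, 100);
        // The count is never more than the pool size, however many requests arrive.
        System.out.println("100 requests were served by " + workers.size() + " thread(s)");
    }
}
```

However many requests are submitted, no more than four threads are ever created, which is the saving the pool provides. The trade-off is that requests queue up once every pooled thread is busy.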
One of the most popular aspects of ZCS is its AJAX-based user interface (UI). When a user signs into their Zimbra account, AJAX is the default interface. Zimbra implements its AJAX web-client service using a combination of the Representational State Transfer (REST) and Model View Controller (MVC) design patterns. The REST design frees the server from tracking the state of the web-client, while the MVC design provides instantaneous updates in response to user actions. The intuitive widgets and icons help users discover the features and let them use drag and drop to manage calendar events and emails. For example, one can use conventional windowing UI techniques to select a group of emails and drag them to a folder.
The price to pay for a slick and intuitive UI is performance. This toll is measured in network bandwidth, memory and processor power. The number of concurrent active users that a server can support will be noticeably lower with the AJAX interface than with the HTML (non-AJAX) interface. Take the drag and drop feature as an example. The feature works best in web-clients with more memory and processing power (e.g. a configuration with dual-core CPUs and 4+ gigabytes of memory). The AJAX interface sends more requests to the server than the HTML interface does. Each request goes through a full client-server transaction life cycle. The choice of recommended UI will therefore affect the number of concurrent users that a Zimbra Server can support.
When a request is received by Zimbra's servlet, it is passed to a "SoapEngine" object for dispatch using a handler. The SoapEngine is the MODEL component of the MVC. The web-client serves up the VIEW component and the handlers make up the CONTROLLER component. Zimbra's MVC design makes it easy for developers to add functional features by extending Zimbra's controller objects and utility objects. For example, if there is a need to use an authentication method other than Zimbra's default LDAP, the provisioning class can be extended so that a sub-class overrides the account authentication method. Likewise, services for folders, mailboxes, accounts, calendars, etc. can be extended in a similar manner by unleashing the power of the object architecture. As long as the data objects are derived from Zimbra's data objects, an alternate architecture can be created to use a different mail store or calendar manager. This object framework makes it possible to extend the system indefinitely and to scale it up to support a large number of users by means of a multi-tier architecture.
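The sub-classing idea can be sketched in plain Java. The classes below are hypothetical stand-ins, not Zimbra's provisioning API: DirectoryProvisioning plays the role of the default provisioning class, PinProvisioning is an illustrative sub-class, and the method names are assumptions chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class ProvisioningSketch {

    // Hypothetical stand-in for the default provisioning class.
    static class DirectoryProvisioning {
        private final Map<String, String> passwords = new HashMap<>();

        void addAccount(String uid, String password) {
            passwords.put(uid, password);
        }

        boolean accountExists(String uid) {
            return passwords.containsKey(uid);
        }

        // Default behaviour: compare the credential with the stored password.
        boolean authenticate(String uid, String credential) {
            return credential != null && credential.equals(passwords.get(uid));
        }
    }

    // Illustrative sub-class: only the authentication method is overridden;
    // account management is inherited unchanged.
    static class PinProvisioning extends DirectoryProvisioning {
        private final Map<String, String> pins = new HashMap<>();

        void enrollPin(String uid, String pin) {
            pins.put(uid, pin);
        }

        @Override
        boolean authenticate(String uid, String credential) {
            return accountExists(uid)
                    && credential != null
                    && credential.equals(pins.get(uid));
        }
    }

    // Callers depend only on the base type, so either class can be plugged in.
    static boolean login(DirectoryProvisioning provisioning, String uid, String credential) {
        return provisioning.authenticate(uid, credential);
    }

    public static void main(String[] args) {
        PinProvisioning provisioning = new PinProvisioning();
        provisioning.addAccount("alice", "secret");
        provisioning.enrollPin("alice", "1234");
        System.out.println("PIN accepted: " + login(provisioning, "alice", "1234"));
        System.out.println("Password accepted: " + login(provisioning, "alice", "secret"));
    }
}
```

Because the login code is written against the base class, swapping the authentication method requires no change to the callers, which is the property the extension mechanism relies on.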
A user request is presented in JSON or XML format. It is converted into a request object and a context object. The request object contains, among other things, the "authtoken" that provides the background data of the request. If mail messages and attachments are included as part of the request, they are presented as a MIME message in RFC 822 format. Calendars and events are presented as iCal messages according to RFC 2445. The context object contains the details of the request, including things such as the HTTP context of the request. The "SoapEngine" maps the request to a handler object. Handler objects are instances of classes in a hierarchy derived from the DocumentHandler class. The dispatcher mechanism in the "SoapEngine" uses polymorphism to process the request and receives a response in JSON message format in return. The response message is then returned to the client browser, which in turn translates it into UI actions.
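The dispatch pattern described above can be illustrated with a small Java sketch. It is not Zimbra source code: HandlerSketch and EngineSketch are hypothetical stand-ins for the DocumentHandler hierarchy and the SoapEngine, the request names "PingRequest" and "EchoRequest" are invented, and requests and responses are reduced to simple string maps instead of JSON or XML documents.

```java
import java.util.HashMap;
import java.util.Map;

public class DispatchSketch {

    // Hypothetical base class standing in for the DocumentHandler hierarchy.
    static abstract class HandlerSketch {
        abstract Map<String, String> handle(Map<String, String> request,
                                            Map<String, String> context);
    }

    static class PingHandler extends HandlerSketch {
        @Override
        Map<String, String> handle(Map<String, String> request, Map<String, String> context) {
            Map<String, String> response = new HashMap<>();
            response.put("status", "ok");
            return response;
        }
    }

    static class EchoHandler extends HandlerSketch {
        @Override
        Map<String, String> handle(Map<String, String> request, Map<String, String> context) {
            Map<String, String> response = new HashMap<>();
            response.put("echo", request.get("text"));
            response.put("client", context.get("remoteAddr"));
            return response;
        }
    }

    // Hypothetical engine: looks up a handler by request name and calls it
    // through the base type, so the engine never needs to know the concrete class.
    static class EngineSketch {
        private final Map<String, HandlerSketch> handlers = new HashMap<>();

        void register(String requestName, HandlerSketch handler) {
            handlers.put(requestName, handler);
        }

        Map<String, String> dispatch(Map<String, String> request, Map<String, String> context) {
            HandlerSketch handler = handlers.get(request.get("name"));
            if (handler == null) {
                Map<String, String> error = new HashMap<>();
                error.put("error", "unknown request");
                return error;
            }
            return handler.handle(request, context); // polymorphic call
        }
    }

    static EngineSketch newEngine() {
        EngineSketch engine = new EngineSketch();
        engine.register("PingRequest", new PingHandler());
        engine.register("EchoRequest", new EchoHandler());
        return engine;
    }

    public static void main(String[] args) {
        Map<String, String> request = new HashMap<>();
        request.put("name", "EchoRequest");
        request.put("text", "hello");
        Map<String, String> context = new HashMap<>();
        context.put("remoteAddr", "192.0.2.1");
        System.out.println(newEngine().dispatch(request, context));
    }
}
```

Adding a new kind of request in this sketch means writing one more sub-class and registering it; the dispatch method itself does not change.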
The Zimbra Server identifies a user by the account UID. An account with an active session is cached as an account object instance in the Zimbra Server. The Zimbra Server organizes account data by folder. For example, the "INBOX" is a folder, and calendar events are associated with a folder. Folders are identified by folder IDs. Zimbra items such as mail messages and calendar events are stored according to the corresponding folder ID. Account, mailbox, and folder objects are kept in memory for the duration of active sessions. When no active session is using an account any longer, the objects related to that account remain in memory until a JVM process called "garbage collection (GC)" initiates their removal. The least recently referenced objects are removed from the cache so that memory is freed. One of the triggers for GC is a low-memory condition. JVM memory management uses different strategies to free up memory. What is important to note here is that objects accumulate for active sessions, which can deplete memory. Since some of these objects cannot be freed while a user is in session, this architecture limits the total number of active accounts that a Zimbra Server can handle. The maximum number of accounts that a Zimbra Server can support is therefore directly related to account content and concurrent user activity. When a user drags the scroll bar in the mail summary screen, a request is sent to the Zimbra Server, which must fetch mail items from the mail store. These items take up memory for a short time. The time-based trigger rapidly releases these transient objects and frees the memory they hold. When new users continue to sign in and existing sessions remain active, there will come a time when the rate at which memory is needed to instantiate objects outstrips the rate at which memory is recovered.
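The eviction of least recently referenced objects can be illustrated with a generic Java technique. The sketch below is not Zimbra's cache: it is a common least-recently-used (LRU) cache built on java.util.LinkedHashMap, and the class names, the capacity of 2, and the account names are assumptions for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCacheSketch {

    // Generic LRU cache: a LinkedHashMap in access order that drops its
    // eldest (least recently referenced) entry once capacity is exceeded.
    static class LruCache<K, V> extends LinkedHashMap<K, V> {
        private static final long serialVersionUID = 1L;
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // true = order entries by access, not insertion
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

    public static void main(String[] args) {
        LruCache<String, String> accounts = new LruCache<>(2);
        accounts.put("alice", "account object for alice");
        accounts.put("bob", "account object for bob");
        accounts.get("alice");                              // alice is referenced again
        accounts.put("carol", "account object for carol");  // bob is evicted
        System.out.println(accounts.keySet());
    }
}
```

Once an entry is evicted and nothing else refers to it, the object becomes eligible for garbage collection. An entry that is still referenced elsewhere, for example by an active session, stays on the heap even after it leaves the cache, which is why long-lived sessions put pressure on memory.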
When heap memory is no longer available, an "out of memory (OOM)" condition will cause the JVM to shut down. The Zimbra Server has a watchdog that automatically restarts the "zmmailboxd" JVM as soon as it crashes. As a result, end users of the Zimbra web-client will experience a momentary reset of their session.
Aside from the server architecture, other factors can contribute to the depletion of processor resources. The statistics collectors are scripts that run periodically and frequently to collect operational statistics. They can easily be turned off to conserve processing power when such monitoring can be done by other means such as VisualVM or JConsole. A MySQL database is also used to accumulate statistics. It can be turned off through source code maintenance. Another facility, start and stop time log records, can be eliminated if your installation is not using the information collected in the "zmmailboxd.out" file. A combination of these strategies can return a substantial amount of much-needed processing power to the Zimbra Server.
One last thought about memory and the Zimbra Server concerns file attachments. Attachment uploads are processed in memory. That means that at some point the whole file occupies heap memory inside the Zimbra Server JVM. With that in mind, one may want to consider how frequently, and how many concurrent users, attach files to their email or open email attachments. Because this usage is transient, the memory is released as soon as the request is completed. However, too many concurrent users working with attachments at the same time will strain the heap memory, which can cause an OOM condition and ultimately crash the JVM. Therefore, one should be mindful to set the file size limit so as not to overrun the heap memory.
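The sizing concern can be turned into simple arithmetic. The following Java sketch assumes the worst case, in which every concurrent upload is at the size limit and fully buffered in the heap. The 10 MiB limit and the choice to reserve a quarter of the heap for attachments are illustrative assumptions, not Zimbra defaults or recommendations.

```java
public class AttachmentHeapSketch {

    static final long MIB = 1024L * 1024L;

    // Worst case: every concurrent upload is at the size limit and is held
    // entirely in heap memory at the same moment.
    static long worstCaseBytes(int concurrentUploads, long maxUploadBytes) {
        return Math.multiplyExact((long) concurrentUploads, maxUploadBytes);
    }

    // Largest number of limit-sized uploads that fit in the share of the heap
    // set aside for attachments.
    static long maxConcurrentUploads(long heapBudgetBytes, long maxUploadBytes) {
        if (maxUploadBytes <= 0) {
            throw new IllegalArgumentException("maxUploadBytes must be positive");
        }
        return heapBudgetBytes / maxUploadBytes;
    }

    public static void main(String[] args) {
        long heap = Runtime.getRuntime().maxMemory(); // heap ceiling of this JVM
        long budget = heap / 4;                       // assumption: a quarter of the heap
        long limit = 10 * MIB;                        // assumption: 10 MiB upload limit
        System.out.println("Heap ceiling (MiB): " + heap / MIB);
        System.out.println("Limit-sized uploads that fit in the budget: "
                + maxConcurrentUploads(budget, limit));
        System.out.println("Heap needed for 20 such uploads (MiB): "
                + worstCaseBytes(20, limit) / MIB);
    }
}
```

For example, with a 512 MiB budget and a 10 MiB limit, at most 51 limit-sized uploads fit at once. Halving the limit roughly doubles that figure, which is the lever the file size setting provides.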
Many open source projects provide building blocks for developers to create system applications. Zimbra is an open source technology that is ready for instant deployment. As a single server, its capacity is limited. However, knowing the inner workings of Zimbra can help you fine-tune the server to handle substantially more load than the default setup. As the number of users grows, a multi-tier architecture can be implemented to use alternative mail stores. Zimbra's object framework enables system architects to extend it indefinitely to fit any enterprise's growing needs.