In this guide we’re going to cover the top 10 Java performance problems that you’re likely to face and show you how to solve them to prevent your applications from slowing down. These common performance issues are split in to five broad categories:
Author: Vinod Mohan | eG Innovations
Java is one of the most popular technologies for application development. Tens of thousands of enterprise applications are powered by Java and millions of people use them daily. Java has been evolving over many decades and there are so many web frameworks, middleware, data access technologies and protocols built on Java. Compared to C, C++, and other languages where memory management is mostly done manually by the programmers, Java is self-regulating and manages memory (free-up and reclamation) on its own, automatically.
Despite this, performance problems can also occur in Java-based applications and when a problem happens, it can be business-impacting. In this blog, we will look at some popular problems that Java developers and administrators encounter and recommend some best practices to resolve and prevent them.
Top 10 Common Java Performance Problems: |
|
---|---|
Memory | |
Threads | |
Database | |
Application / Code | |
Infrastructure |
The dreaded java.lang.OutOfMemoryError is an indication that the application is attempting to add more data to the memory, but there is no additional room for it. Out-of-memory errors result in failures that the application cannot recover from and hence, must be avoided at all cost.
There can be many reasons why an out-of-memory error occurs:
Under-provisioned memory: First, the configured heap memory in the JVM may not be not sufficient for the application. The application may attempt to put more data into the heap, but there is no more room for it. Consider the case of an application attempting to read and store a 256 MB file in memory. The JVM needs to be configured with a heap size of at least 256 MB for this to work. While specifying adequate heap memory for the JVM is important, it is equally important to ensure that the other memory spaces used by the JVM also have sufficient memory. For instance, the Oracle JVM has multiple memory spaces:
Each of these memory spaces has space usage limits that can be individually set. When any of these memory spaces is fully utilized, application errors will occur.
Spike in incoming traffic: Second, a spike in application load can trigger an out-of-memory exception. Consider a load balanced server cluster where each of the JVMs is configured to handle its normal load. When one of the nodes goes down, the other node will need to handle the additional workload. When the memory configured in the JVM in not sufficient to handle the increased workload, out-of-memory exceptions will occur.
Programming error: Third, a memory leak in the application can be caused by a programming error. The Java garbage collector is designed to reclaim the memory consumed by unused objects. But if a program keeps adding memory to the heap (e.g., a continuously growing hash table), an out-of-memory error is inevitable.
Helpful Troubleshooting Tips:
Garbage collection (GC) is a very useful process in the JVM that frees up room to add new data in memory. As much as it is useful, it can also turn undesirable if it happens too often. When garbage collection runs, it can hog CPU, the JVM’s processing may be paused and this may choke the performance of the application. The Oracle JVM supports different garbage collection algorithms: serial collector, parallel collector, the concurrent mark sweep (CMS) collector and the Garbage-first collector (G1GC). The choice of the garbage collector can have an impact on performance.
Configuring your JVM’s memory to be too large can also be detrimental to performance. In such a case, garbage collection can take a very long time to complete, affecting performance.
Helpful Troubleshooting Tips:
While caching is an essential process for faster reading of data in-memory (as opposed to making a database call across the network), it is counter intuitive to allocate excessive memory for caching. Sub-optimal memory configuration for caching will lead to more GC pauses and subsequently affect application processing.
Misconfiguration in caching will also lead to problems. Cached objects are stateful in nature, unlike pools that have stateless objects. When caching is not properly set, a recently used object could be mistakenly removed from the cache, to make room for a new object, resulting in a “cache miss” scenario when that object is being called. In addition to memory configuration, cache hit and cache miss configurations are also vital to set properly.
Helpful Troubleshooting Tips:
Java applications, especially web-based applications are often multi-threaded. Multi-threading helps with scalability, but at the same time, when multiple threads need to access shared JVM resources (often memory), locking is used to ensure that access to the shared resources is exclusively provided to each thread. When one thread locks a resource, other threads wait for the lock to be released. The Java programming language makes it easy for developers to implement synchronization between threads. The synchronized keyword can be used to create a block of code that is synchronized. Methods can also be synchronized.
Because it is so easy to create synchronized blocks, many a times developers create synchronized blocks without understanding the performance implication of such code blocks. When hundreds of threads synchronize on the same lock, the Java application’s processing of requests is severely affected and users will experience excessive slowness. When such a situation happens in production and there are hundreds of threads running in the JVM, it is very difficult to determine which lock caused the slowness and which block of code is the culprit.
Another issue with thread locking is deadlocks. For example, thread A which has an object lock waits for the execution of thread B, while thread B has a lock of its own and waits for the execution of thread A. Now, both these threads are deadlocked and will never execute causing application hangs or crashes.
Too much synchronization also takes a toll on performance. By over-synchronizing threads, one could end up facing the problem of a thread gridlock, where many threads are using the same lock, and waiting until the lock gets released.
Helpful Troubleshooting Tips:
Most Java applications use database servers for persistent storage of data. Connections to the database server are used to store and retrieve data. Because establishing a database connection for each request is expensive, a connection pool is often used. The connection pool has an initial setting for the number of connections that will be pre-established when the application starts. When additional connections are required, the pool is dynamically grown subject to a maximum specified limit.
If the number of connections in use reaches the maximum limit, newer requests will have to wait until processing of existing database requests is completed. It is important for developers and DBAs to have a fair estimation of the application workload and set the configuration accordingly. At the same time, specific application modules or web pages may have connection leaks – i.e., a connection is obtained from the pool, but it is not released back into the pool. Such connection leaks will ultimately result in application errors being reported to users. High connection pool usage can also occur during times when the database server has slowed down its processing. Hence, it is important to differentiate performance issues that are a result of database connection leaks as compared to ones that are a result of a database server bottleneck.
Helpful Troubleshooting Tips:
Database is an integral part of the application architecture. Performance of the application greatly depends on how fast the database responds and executes queries. According to a DZone performance monitoring survey, database problems rank second for the most likely cause of application performance problems. Not just that, but application developers wrongly get blamed for application issues when it is in fact a database query issue that should be addressed by a DBA. There are many reasons how a slow database query can affect application transaction processing:
Slow queries: When developing an application, developers are focused on getting the functionality right rather than on performance. While their database queries may be returning the correct results, these queries may not be designed optimally. For a query to be optimally designed, it must avoid full table scans at the database level. It must make use of database indexes, so the results are returned in the fastest possible manner. Getting queries to be optimally designed often requires the involvement of a database administrator. The DBA can analyze a query’s explain plan and provide recommendations on how to tune it for optimal performance. These could include redesigning the query, using existing indexes, recommendations for new indexes, addition of hints, etc.
Unused indexes: While it is good practice to have an index on every foreign key in a table, you must also bear in mind what kind of queries are being executed. An index may not be needed when you don’t use a specific column for your queries. Unused indexes will occupy space on the disk and the database need update the indexes every time when there is an insertion/deletion of records. This will slow down query processing.
Insufficient database resources: When database is running out of server resources, such as CPU, memory and disk, it will have an adverse impact of query execution.
Helpful Troubleshooting Tips:
The DZone performance monitoring survey referenced earlier cites code-level problem as the top cause of application performance issues. Most code-level issues are due to bugs in the code constructs, such as long waits, poor iteration, inefficient code algorithms, bad choice of data structures, etc. For example, iterating through a Vector with hundreds of thousands of records will be inefficient. A HashMap may be a more efficient data structure for this task. In most cases, code-level issues manifest as loops that take up CPU cycles in the JVM.
Then, there could be performance bugs in third-party frameworks use in application development. Ideally, all code-level issues should be captured by the QA team and fixed by the development team before production rollout. But this is not always the case.
Helpful Troubleshooting Tips:
The application server is a critical component of a Java application architecture. Popular Java application servers are Oracle WebLogic, IBM WebSphere, JBoss, WildFly, Tomcat, etc. A bottleneck in the application server would directly impact the business transactions and affect application performance and end-user experience. Problems in servlet execution, bean caching, queuing, JDBC connectivity, etc. will affect performance.
Transaction rollback is another issue to deal with. An application rollback is typically the result of pre-designed business logic. But a non-application rollback is a serious issue and needs to be addressed immediately. The application would throw an exception and the transaction would roll back in return, not allowing the end-user using the application to process their request. There are three types of non-application rollbacks:
Helpful Troubleshooting Tips:
All applications and supporting middleware and back end are run on enterprise servers. This could be a server operating system on a physical box, a virtual machine, or even in the cloud. A problem in the server hardware or resources will affect the performance of the application running on it. Insufficient CPU, memory, and disk, operating system errors, server hardware faults, high-CPU processes, zombie processes, corrupt services, etc. are some common problems administrators deal with. Especially when the server is virtualized, it is even more difficult to pinpoint problems. A resource contention issue at the virtualized host server can impact all the guest virtual machines and applications running on them. The same with containers and cloud-based application workloads.
Helpful Troubleshooting Tips:
Bandwidth congestion in the network, high latency and packet loss, misconfiguration in a router, DNS failure, etc. could affect the performance of applications. Usually there is a lot of finger-pointing between the application team and network team as to where the root cause of an application problem is and who needs to resolve it. When it is actually a network issue, the application team could be chasing a red herring on the server side.
Helpful Troubleshooting Tips:
It is important for developers to code efficiently and testers to catch and report issues – no doubt! The IT teams should also act proactively have the necessary performance management measures in place. Monitoring should not be an afterthought for organizations developing, hosting, implementing and using Java-based applications. Constant monitoring of user experience, application transactions, application code, and the supporting infrastructure is vital to detect and resolve problems before they become business-impacting.