FusionAuth server completely locked up
-
Any idea what happened here? The server was completely locked up and I had to do a force reboot. After rebooting, everything is working normally again.
2023-11-21 07:09:41.030 PM ERROR io.fusionauth.app.action.BaseOAuthCallbackAction - Unable to call FusionAuth Token endpoint using code [bpzbDSTDZeu5flug8js8gpr79ZWQs6vBSFNck01cKdQ]. HTTP Status Code [400]. Error message: [{ "error" : "invalid_request", "error_description" : "Invalid Authorization Code", "error_reason" : "auth_code_not_found" }] 2023-11-22 03:59:22.067 AM ERROR io.fusionauth.http.server.HTTPServerThread - An exception was thrown during processing java.lang.NullPointerException: Cannot invoke "io.fusionauth.http.server.RequestPreambleState.store()" because "<local3>" is null at io.fusionauth.http.server.HTTPRequestProcessor.processPreambleBytes(HTTPRequestProcessor.java:95) at io.fusionauth.http.server.HTTP11Processor.read(HTTP11Processor.java:155) at io.fusionauth.http.server.HTTPS11Processor.read(HTTPS11Processor.java:148) at io.fusionauth.http.server.HTTPServerThread.read(HTTPServerThread.java:348) at io.fusionauth.http.server.HTTPServerThread.run(HTTPServerThread.java:176) 2023-11-22 06:46:49.718 AM ERROR com.inversoft.scheduler.LogAndRetainFailureHandler - The scheduled service [class com.inversoft.authentication.api.service.AuthenticationKeyCacheLoader] failed but will be re-run. org.apache.ibatis.exceptions.PersistenceException: ### Error querying database. Cause: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 155614ms. ### The error may exist in com/inversoft/authentication/api/domain/AuthenticationKeyMapper.xml ### The error may involve com.inversoft.authentication.api.domain.AuthenticationKeyMapper.retrieveAll ### The error occurred while executing a query ### Cause: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 155614ms. at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30) at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:156) at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:147) at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:142) at jdk.internal.reflect.GeneratedMethodAccessor28.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.apache.ibatis.session.SqlSessionManager$SqlSessionInterceptor.invoke(SqlSessionManager.java:355) at jdk.proxy2/jdk.proxy2.$Proxy53.selectList(Unknown Source) at org.apache.ibatis.session.SqlSessionManager.selectList(SqlSessionManager.java:204) at org.apache.ibatis.binding.MapperMethod.executeForMany(MapperMethod.java:147) at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:80) at org.apache.ibatis.binding.MapperProxy$PlainMethodInvoker.invoke(MapperProxy.java:142) at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:86) at jdk.proxy2/jdk.proxy2.$Proxy59.retrieveAll(Unknown Source) at com.inversoft.authentication.api.service.AuthenticationKeyCacheLoader.internalLoad(AuthenticationKeyCacheLoader.java:38) at com.inversoft.cache.BaseCacheLoader.load(BaseCacheLoader.java:21) at com.inversoft.authentication.api.service.AuthenticationKeyCacheLoader.run(AuthenticationKeyCacheLoader.java:33) at com.inversoft.scheduler.InjectableRunnable.run(InjectableRunnable.java:56) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at java.base/java.lang.Thread.run(Thread.java:833) Caused by: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 155614ms. at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:696) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:181) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:146) at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:100) at org.apache.ibatis.transaction.jdbc.JdbcTransaction.openConnection(JdbcTransaction.java:145) at org.apache.ibatis.transaction.jdbc.JdbcTransaction.getConnection(JdbcTransaction.java:67) at org.apache.ibatis.executor.BaseExecutor.getConnection(BaseExecutor.java:345) at org.apache.ibatis.executor.SimpleExecutor.prepareStatement(SimpleExecutor.java:89) at org.apache.ibatis.executor.SimpleExecutor.doQuery(SimpleExecutor.java:64) at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:333) at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:158) at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:110) at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:90) at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:154) ... 23 common frames omitted 2023-11-22 06:46:49.880 AM INFO io.fusionauth.api.service.system.NodeService - Node [951665d2-e50f-4914-8950-6b96246c7357] with address [http://10.144.65.18:8080] removed because it has not checked in for the last [213] seconds. Bye node. 2023-11-22 11:28:03.210 AM INFO io.fusionauth.api.service.system.NodeService - Node [951665d2-e50f-4914-8950-6b96246c7357] with address [http://10.144.65.18:8080] removed because it has not checked in for the last [209] seconds. Bye node. 2023-11-22 11:33:37.574 AM WARN com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=1m13s351ms914µs201ns). 2023-11-22 11:35:16.620 AM WARN com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=6m20s354ms981µs782ns). 2023-11-22 11:36:07.338 AM WARN com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=50s941ms257µs890ns). 2023-11-22 11:37:09.289 AM WARN com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=55s420ms783µs562ns). 2023-11-22 11:38:35.711 AM WARN com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=1m23s325ms602µs713ns). 2023-11-22 11:39:31.981 AM WARN com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Thread starvation or clock leap detected (housekeeper delta=1m5s637ms113µs254ns). 2023-11-22 12:07:04.616 PM INFO io.fusionauth.api.configuration.DefaultFusionAuthConfiguration - Loading FusionAuth configuration file [/usr/local/fusionauth/config/fusionauth.properties] 2023-11-22 12:07:04.703 PM INFO io.fusionauth.api.configuration.DefaultFusionAuthConfiguration - Dynamically set property [fusionauth-app.url] set to [http://10.144.65.18:8080] 2023-11-22 12:07:04.705 PM INFO com.inversoft.configuration.BasePropertiesFileInversoftConfiguration - - Overriding default value of property [database.mysql.enforce-utf8mb4] with value [true] - Overriding default value of property [fusionauth-app.runtime-mode] with value [production] - Overriding default value of property [search.type] with value [database]
-
Hi @brad ,
From here, it looks like the database was unavailable or FusionAuth ran out of connections.
@brad said in FusionAuth server completely locked up:
Caused by: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 155614ms.
This can be caused by lambdas or connectors that are too slow to respond or are blocked from responding. It also can happen if FusionAuth is under extreme load and the database is undersized.
-
@dan The database backend is AWS Aurora, and it's a big production instance, so it's unlikely that it would run out of connections. Also, other apps running on the same database continued to work fine. FusionAuth is currently in testing. Since we only have a couple of users testing it, the load is extremely light, so I don't see how it could have been under extereme load.
-
@brad Interesting. The error message
Connection is not available, request timed out after 155614ms
makes me suspicious, but I hear you.I'm not familiar with the performance characteristics of Aurora, to be honest. We definitely have some folks running FusionAuth with that as their database, but it isn't typical and is not listed as a supported database.
You could try tweaking some of the database connection pool settings mentioned here: https://fusionauth.io/docs/reference/configuration
I'd also suggest googling around to see whether anyone else has issues with Hikari (our connection pool software) and the version/engine of Aurora that you are using.
Finally, if you have a plan that includes engineering support, I'd encourage you to open a support ticket.