Stop Press: Publishing Stopped With NullPointerException
“Stop Press” or “Stop the Presses” is a term from the print media industry. It is called when content needs to change just before or during printing, and the printing press has to be stopped. Unfortunately, this places the publishing of all content on hold until the change is resolved. In the digital media age the term has become somewhat obsolete, as more and more content is published, delivered and consumed digitally, typically through Content Management Systems such as Oracle’s WebCenter Sites.
Though the need to change content does not stop publishing anymore, there are times when publishing fails and we have a “Stop Press” situation once again. Over time I have seen publishing failures for various reasons. In some cases it was due to infrastructure errors, such as network connectivity, and in other cases it was due to data errors, such as missing dependencies. I have also seen situations in which Oracle WebCenter Sites's scheduled automated publishing was in use and the event that publishes the queue had stopped working, while the publishing mechanism itself was fine, so manual publishes still worked.
Using a recent incident as an example, I will walk through the steps that were taken to resolve that particular publishing problem, as one possible approach to diagnosing this type of issue. I started by looking at the publishing history and logs. I could not find a trace of the publish occurring. Typically, even when there is a network connectivity issue, there is still an error in the log to indicate that there was an attempt. As a sanity check, I used the Admin UI to validate that the destination server was up and to confirm that automated publishing (every 5 minutes in this case) was set up.
The above led me to believe that perhaps the event had failed. To validate this, I attempted to execute a manual publish, which returned an error -100 and still left no information in the logs. Looking at the publishing history again, I found no trace of my attempt: the publishing mechanism was not even triggering. I then attempted to edit the publishing destination and point the server host at a different server, to rule out any issue with an individual server, and found that I was unable to save the publishing destination. Looking at the logs I found the following error:
ERROR com.fatwire.logging.cs.xcelerate - [pagename=OpenMarket/Xcelerate/Actions/PublishConsolePost] Error: com.openmarket.xcelerate.commands.PubTargetManagerDispatcher: CanEdit: java.lang.NullPointerException java.lang.NullPointerException at com.openmarket.xcelerate.publish.PubSessionManager.IsPublishRunning(y:1232) at com.openmarket.xcelerate.publish.PubTargetManager.canEdit(y:1854) at com.openmarket.xcelerate.commands.PubTargetManagerDispatcher.CanEdit(y:3032) at sun.reflect.GeneratedMethodAccessor535.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.openmarket.framework.commands.Dispatcher.Execute(y:136) at COM.FutureTense.XML.Template.bA.B(y:49) at COM.FutureTense.XML.Template.H.A(y:3173) at COM.FutureTense.XML.Template.H.B(y:1888) at COM.FutureTense.XML.Template.GA.B(y:3058) at COM.FutureTense.XML.Template.O.B(y:3201) at COM.FutureTense.XML.Template.H.A(y:3173) at COM.FutureTense.XML.Template.H.B(y:1888) at COM.FutureTense.XML.Template.D.B(y:2969) at COM.FutureTense.XML.Template.ZA.C(y:1744) at COM.FutureTense.Common.i.A(y:472) at COM.FutureTense.Common.i.evalTemplate(y:3246) at COM.FutureTense.Common.i.processElement(y:1054) at COM.FutureTense.XML.Template.$A.B(y:1598) at COM.FutureTense.XML.Template.H.A(y:3173) at COM.FutureTense.XML.Template.H.B(y:1888) at COM.FutureTense.XML.Template.GA.B(y:3058) at COM.FutureTense.XML.Template.O.B(y:3201) at COM.FutureTense.XML.Template.H.A(y:3173) at COM.FutureTense.XML.Template.H.B(y:1888) at COM.FutureTense.XML.Template.D.B(y:2969) at COM.FutureTense.XML.Template.ZA.C(y:1744) at COM.FutureTense.Common.i.A(y:472) at COM.FutureTense.Common.i.evalTemplate(y:3246) at COM.FutureTense.Common.i.processElement(y:1054) at COM.FutureTense.XML.Template.$A.B(y:1598) at COM.FutureTense.XML.Template.H.A(y:3173) at COM.FutureTense.XML.Template.H.B(y:1888) at COM.FutureTense.XML.Template.GA.B(y:3058) at 
COM.FutureTense.XML.Template.O.B(y:3201) at COM.FutureTense.XML.Template.H.A(y:3173) at COM.FutureTense.XML.Template.H.B(y:1888) at COM.FutureTense.XML.Template.bA.B(y:2856) at COM.FutureTense.XML.Template.H.A(y:3173) at COM.FutureTense.XML.Template.H.A(y:3011) at COM.FutureTense.XML.Template.H.A(y:3011) at COM.FutureTense.XML.Template.H.B(y:1888) at COM.FutureTense.XML.Template.GA.B(y:3058) at COM.FutureTense.XML.Template.O.B(y:2207) at COM.FutureTense.XML.Template.H.A(y:3173) at COM.FutureTense.XML.Template.H.B(y:1888) at COM.FutureTense.XML.Template.D.B(y:2969) at COM.FutureTense.XML.Template.ZA.C(y:1744) at COM.FutureTense.Common.i.A(y:472) at COM.FutureTense.Common.i.evalTemplate(y:3246) at COM.FutureTense.Common.i.A(y:3467) at COM.FutureTense.Common.i.evalPage(y:3048) at COM.FutureTense.Common.i.execute(y:3196) at COM.FutureTense.Servlet.FTServlet.execute(y:1088) at COM.FutureTense.Servlet.FTServlet.doPost(y:3299) at javax.servlet.http.HttpServlet.service(HttpServlet.java:637) at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at com.fatwire.wem.sso.cas.filter.CASFilter.doFilter(CASFilter.java:219) at com.fatwire.wem.sso.SSOFilter.doFilter(SSOFilter.java:45) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:206) at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:179) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:567) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454) at java.lang.Thread.run(Thread.java:662)
Looking at the error I figured out what was happening: a new publish cannot be triggered while an existing publish is running. Although the publishing console showed no active publishing sessions, for some reason the application was under the impression that one was running. This led me to check the database for data corruption.
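Because the log writes the whole trace on a single line, the frames that matter are easy to miss. Here is a minimal sketch, in Python, of pulling the first few frames out of a flattened trace like the one above. The helper name top_frames is hypothetical, and the sample string is simply the first three frames copied from the error.

```python
import re


def top_frames(flattened_trace, limit=3):
    """Return the first `limit` stack frames from a trace logged on one line."""
    # Each frame looks like " at package.Class.method(Source:123)".
    frames = re.findall(r"\bat\s+([\w.$]+)\(([^)]*)\)", flattened_trace)
    return [f"{method}({location})" for method, location in frames[:limit]]


# First three frames of the error shown above, joined on one line as in the log.
sample = (
    "CanEdit: java.lang.NullPointerException java.lang.NullPointerException "
    "at com.openmarket.xcelerate.publish.PubSessionManager.IsPublishRunning(y:1232) "
    "at com.openmarket.xcelerate.publish.PubTargetManager.canEdit(y:1854) "
    "at com.openmarket.xcelerate.commands.PubTargetManagerDispatcher.CanEdit(y:3032)"
)

for frame in top_frames(sample):
    print(frame)
```

Read this way, the top frame is PubSessionManager.IsPublishRunning, called from PubTargetManager.canEdit, which is what points at the "is a publish already running" check rather than at the destination itself.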
In the database there is a table called PubSession. This table contains a record of all publishes that have occurred or are occurring, with a bunch of information about each session, such as the user that triggered it, the start time, end time, and status. This information is used to generate the Publishing History console.
I queried this table using the SQL statement SELECT * FROM PubSession ORDER BY CS_SESSIONDATE DESC and found that the status of every record was either ‘D’ or ‘F’, with one exception, which had a status of ‘I’. I looked for the session with status ‘I’ in the Publishing History console and could not find it. This was the corrupt record. Once that record was deleted from the table, publishing worked once again.
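The check described above can be narrowed so that only the suspicious rows come back. The following is a rough sketch rather than the exact query I ran. It uses an in-memory SQLite table as a stand-in for the real repository database, the rows are made-up sample data, and the column names ID and CS_STATUS are assumptions (only CS_SESSIONDATE appears in the query above), so check them against your own schema first.

```python
import sqlite3


def find_unfinished_sessions(conn):
    """Return PubSession rows whose status is neither 'D' nor 'F'."""
    # ID and CS_STATUS are assumed column names for this sketch.
    query = (
        "SELECT ID, CS_SESSIONDATE, CS_STATUS FROM PubSession "
        "WHERE CS_STATUS NOT IN ('D', 'F') "
        "ORDER BY CS_SESSIONDATE DESC"
    )
    return conn.execute(query).fetchall()


# In-memory stand-in for the real database, filled with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PubSession (ID INTEGER, CS_SESSIONDATE TEXT, CS_STATUS TEXT)"
)
conn.executemany(
    "INSERT INTO PubSession VALUES (?, ?, ?)",
    [
        (1, "2014-01-01 10:00:00", "D"),
        (2, "2014-01-01 10:05:00", "F"),
        (3, "2014-01-01 10:10:00", "I"),
    ],
)

stuck = find_unfinished_sessions(conn)
print(stuck)
```

The sketch only reads from the table. Any row it returns should still be compared against the Publishing History console before anything is deleted, since a publish that really is running will show up in this result as well.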
In this case, it was difficult to identify the root cause of the issue. However, after some investigation we discovered that the server had lost its connection to the database in the middle of triggering a publishing session. This was one of those rare occasions where something happened at just the right time and in just the right way for this to occur. Often these situations cannot be avoided, but the risks can be mitigated by:
1. Having dedicated servers for publishing to minimize strain on the servers that are also being used to manage or deliver content.
2. Reducing the risk of the database being a single point of failure by ensuring that your infrastructure is clustered and built for failover.
3. Defining a process for maintenance and ensuring it is communicated to the appropriate parties. When maintenance on the database or shared file system may cause it to go down or offline, WebCenter Sites should be shut down and an outage scheduled to prevent data corruption.
In the digital age printing has been replaced with the more convenient publishing. Over the last 15 years or so, WCS's publishing capabilities have matured into the robust mechanism we have today, but it can still occasionally run into problems. When looking into these issues it is important to understand that a lot happens when a publish takes place, and that a number of external factors can impact publishing, such as data issues, filesystem permissions and network performance. It is vital to take a holistic view when resolving publishing issues.