Howdy, CFers! We've got an incredibly frustrating situation with a CF Web Services-based API that we wrote and maintain. We had an API in place for years that was stable and working happily with Ruby, PHP, and ColdFusion clients. Then this year a .NET client came along, and we found that our web service was not interoperable with statically-typed languages due to our extensive use of structs.
We eventually realized we had to rewrite the API without structs, and we've done so. It now uses scalar values, arrays, and CFCs (which get translated to SOAP complexTypes). The .NET client is happy, and we wrote proof-of-concept clients in about 6 different languages to ensure that we'd be interoperable this time around.
To our great dismay, it appears that our ColdFusion 7 servers can't serve the new API reliably. It works for about a day or so after restarting, then the clients start getting errors like:
Error: coldfusion.xml.rpc.CFCInvocationException [java.lang.ClassNotFoundException : tafkan.remote_api.pfapi.v.trunk.rsp_pf_survey_status_array]
and
java.lang.NoClassDefFoundError: tafkan/remote_api/pfapi/v/trunk/pf_unit
Restarting the CF instances is the only way to make the problem go away. A lot of time and money was put into rebuilding the API, so everyone is really at wit's end about this.
We've noticed that the WEB-INF/cfc-skeletons directories of our CF instances eventually seem to have two copies of the classes for each of the CFCs used by the API. For example:
-rw-r--r-- Feb 17 09:15 remote_api.pfapi.v.trunk.pf_datum.class
-rw-r--r-- Feb 3 12:20 tafkan.remote_api.pfapi.v.trunk.pf_datum.class
It seems like the errors are coming from a namespace or class search path problem, so we tried switching all CFC references to be fully-qualified (dot notation starting with a mapping) instead of just simple references to CFCs in the current directory. This seemed promising, but the problem came back within 24 hours.
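For anyone who wants to check their own instances for the same symptom, here is a minimal sketch of a script that flags skeleton classes present under two names, where one name is the other with a leading mapping prefix (as with the pf_datum pair above). It assumes only the file naming shown in the listing; the function names are made up for illustration, and it reads the directory without touching anything.

```python
import os
import sys


def find_shadowed_skeletons(filenames):
    """Return (shorter, longer) pairs of class names where the longer name
    is the shorter one with extra leading dotted segments (e.g. a mapping)."""
    suffix = ".class"
    names = sorted(f[:-len(suffix)] for f in filenames if f.endswith(suffix))
    pairs = []
    for shorter in names:
        for longer in names:
            # Require a dot boundary so "x.pf_datum" does not match "my_pf_datum".
            if longer != shorter and longer.endswith("." + shorter):
                pairs.append((shorter, longer))
    return pairs


def scan_directory(path):
    """Scan one cfc-skeletons directory (path supplied by the caller)."""
    return find_shadowed_skeletons(os.listdir(path))


if __name__ == "__main__" and len(sys.argv) > 1:
    for shorter, longer in scan_directory(sys.argv[1]):
        print("duplicate skeleton: %s  <->  %s" % (shorter, longer))
```

Run it with the path to an instance's WEB-INF/cfc-skeletons directory as the only argument. An empty result does not rule out the problem; it only means the two-name symptom is not present at that moment.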
Environment:
- ColdFusion 7,0,2,142559 with hf702-70523, 2-instance cluster
- Sun Java 1.4.2_13
- Apache 2.0.52
- CentOS 4.5 32-bit
Maybe upgrading one of these venerable pieces of software would help? Maybe upgrading just AXIS?
We need help! I'm sure that there is someone out there with more CF/AXIS/SOAP experience than we have who can help us get this problem resolved. Adobe support doesn't seem to be an option, as CF7 is EOL'ed and in extended-extended support (and even that ends in a few days). We will pay the right person good money to help us figure this out. If you're that person, or think you might know who they are, please contact me ASAP!
Thanks for reading this mega-post!
Leon
Update:
Thanks to all who've joined this discussion! Here's an update on where things stand at the moment.
The service just crapped out for the first time today. One of the cluster instances was still able to generate the WSDL, while the other instance said:
AXIS error
Sorry, something seems to have gone wrong... here are the details:
Exception - java.lang.NoClassDefFoundError: tafkan/remote_api/pfapi/v/trunk/rsp_pf_numeric_array
Both cfc-skeletons directories contained a file called tafkan.remote_api.pfapi.v.trunk.rsp_pf_numeric_array.class, and neither appeared to contain the differently named files we've sometimes seen (remote_api.pfapi.v.trunk.rsp_pf_numeric_array.class). The files in cfc-skeletons do not appear to have been modified since the servers were started yesterday.
The uptime on both instances was about 21.5 hours. I was running without JIT (-Xint).
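To pin down when each instance starts failing, rather than finding out from clients, something like the following could poll each instance's WSDL URL and log the result. This is only a sketch: the marker strings are taken from the error output quoted in this post, the "definitions" check is a rough assumption about what a healthy WSDL response contains, and the URLs you pass in are whatever your instances actually expose.

```python
import time
import urllib.error
import urllib.request

# Strings seen in the failing responses quoted in this thread.
FAILURE_MARKERS = ("AXIS error", "NoClassDefFoundError", "ClassNotFoundException")


def classify_wsdl_response(body):
    """Classify a response body as 'class_error', 'ok', or 'unknown'."""
    for marker in FAILURE_MARKERS:
        if marker in body:
            return "class_error"
    # Assumption: a healthy WSDL document has a <definitions> root element.
    if "<" in body and "definitions" in body:
        return "ok"
    return "unknown"


def fetch(url, timeout=10):
    """Fetch a URL and return its body, including the body of HTTP error pages."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        return exc.read().decode("utf-8", errors="replace")


def poll(urls, interval_seconds=60):
    """Log a timestamped status line per URL until interrupted."""
    while True:
        for url in urls:
            try:
                status = classify_wsdl_response(fetch(url))
            except OSError as exc:  # connection refused, timeout, DNS failure
                status = "unreachable (%s)" % exc
            print(time.strftime("%Y-%m-%d %H:%M:%S"), url, status)
        time.sleep(interval_seconds)
```

Calling poll() with one direct URL per cluster instance (bypassing the load balancer) would show whether the instances fail together or independently, and at what uptime.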
I've now restarted both instances. They're now running on Sun Java 1.4.2_19 (instead of _13), and JIT has been re-enabled, since it clearly wasn't causing this error and things were dramatically slower without it. I've also cleared the "save class files" check boxes.
And now, we wait again...
Update 2:
The problem persists. I'm not sure what else to try at this point. Arg!
FYI, this is cross-posted at http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:60922