My team inherited an Oracle-based web application, and we are fairly inexperienced with Oracle database servers.
The Oracle 10g server runs on Windows Server 2003 with plenty of disk space. From time to time all connectivity is lost: the application stops working, and not even SQL*Plus can connect to the database server.
But when we check the Windows Services manager, it reports that the service is up and running. A restart usually fixes the problem, but we need to troubleshoot it properly so that we know what is causing it and can prevent it from happening again.
Where should we start looking for clues? What are the critical log files we should be investigating?