Here goes the story in short -
We had to bump up Oracle memory to address buffer cache growth. So, in our Solaris database box, we increased Oracle project memory in /etc/project from 32 GB to 64 GB - we had enough physical memory in the server - 128 GB.
Then we increased SGA_MAX_SIZE from 28 GB to 36 GB and MEMORY_MAX_SIZE 28 GB to 42 GB. We did not set the target parameters as we had not been not using Automatic Memory Management (AMM) or Automatic Shared Memory Management (ASMM).
Then from the following day, we started receiving errors while running the RMAN backup jobs -
ORA-00603: ORACLE server session terminated by fatal error
ORA-27103: internal error
Solaris-AMD64 Error: 9: Bad file number
Additional information: 108
Additional information: 4456448
Also, receiving ORA-04030 error in alert_log file.
ORA-04030: out of process memory when trying to allocate 824504 bytes (pga heap,kco buffer)
ORA-27103: internal error
Solaris-AMD64 Error: 9: Bad file number
It was clear to me that some memory allocation issue we hit as auto allocation of PGA memory was failing. It's production box, and internal error, so interpretation should come from Oracle Support. We opened SR. Sent several trace and alert logs. At the beginning, they thought that it's OS resource allocation problem, then support thought it's a RMAN backup problem and possibly a bug, and suggested several alternatives. That did not help at all. Finally they forwarded the SR to Solaris OS Memory Engineers. They found something interesting for us.
After 3 weeks back and forth information with Oracle Support, we have been told that, by setting SGA_MAX_SIZE & MEMORY_MAX_SIZE we had enabled Dynamic Intimate Shared Memory (DISM) feature of Oracle Solaris. And this may not work in Solaris 10 x86 based system, also not recommended. So, we need to disable DISM by NOT setting SGA_MAX_SIZE, SGA_TARGET, MEMORY_MAX_SIZE and MEMORY_MAX_TARGET.
The steps we followed to disable DISM -
1. Setting individual SGA components (like shared pool, buffer cache, large pool etc.)
2. Create a PFILE from the SPFILE
3. Removed all the four parameters SGA_MAX_SIZE, SGA_TARGET, MEMORY_MAX_SIZE and MEMORY_MAX_TARGET from PFILE and let Oracle compute the SGA_MAX_SIZE aggregating all individual SGA components
4. Startup using the edited PFILE
5. Creating SPFILE from the edited PFILE
6. Shutdown & restart again so that SPFILE takes effect
Details of DISM could be found in the following doc - Oracle support referred this.
http://www.oracle.com/technetwork/articles/systems-hardware-architecture/using-dynamic-intimate-memory-sparc-168402.pdf
***