When I run two successive reconstructions with system matrix caching (recomputing the system matrix in both), the second reconstruction gets killed while computing the system matrix. This behaviour was introduced with PR #1592. Sanitisers do not flag any memory leaks, so the most likely cause is the allocation of huge chunks of contiguous memory: by the time the second reconstruction runs, the memory may already be fragmented.
If fragmentation is indeed the problem, the following may be solutions:
- do not insist on contiguous memory for system matrices (which are huge: in my case around 190 GB)
- find a way to not just free the allocated memory, but actively return it to the operating system (apparently "malloc_trim(0)" could work on Linux)
- another option is apparently to request the memory directly from the operating system, rather than with "new". On Linux this would be with mmap and on Windows with VirtualAlloc. Freeing such an allocation (munmap on Linux) would immediately return the memory to the operating system.
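
To make the last two options concrete, here is a minimal Linux-only sketch, not a proposed patch. The names `MappedBuffer` and `release_free_heap_memory` are hypothetical and do not exist in the code base. `malloc_trim` is a glibc extension, so the sketch assumes glibc, and I have not checked whether either approach actually fixes the problem described above.

```cpp
// Linux/glibc-only sketch. MappedBuffer and release_free_heap_memory are
// hypothetical names used for illustration only.
#include <sys/mman.h>  // mmap, munmap
#include <malloc.h>    // malloc_trim (glibc extension)
#include <cassert>
#include <cstddef>
#include <new>         // std::bad_alloc

// Owns a block of anonymous memory obtained directly from the kernel,
// bypassing the "new"/malloc heap entirely.
class MappedBuffer {
public:
    explicit MappedBuffer(std::size_t bytes) : size_(bytes) {
        void* p = mmap(nullptr, size_, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            throw std::bad_alloc();  // also thrown for bytes == 0
        }
        data_ = p;
    }

    ~MappedBuffer() {
        // munmap hands the address range back to the operating system.
        int rc = munmap(data_, size_);
        assert(rc == 0);
        (void)rc;
    }

    MappedBuffer(const MappedBuffer&) = delete;
    MappedBuffer& operator=(const MappedBuffer&) = delete;

    // Anonymous mappings start out zero-filled.
    float* data() { return static_cast<float*>(data_); }
    std::size_t size_bytes() const { return size_; }

private:
    void* data_ = nullptr;
    std::size_t size_ = 0;
};

// Asks glibc to return free heap memory to the operating system.
// Returns true if some memory was actually released.
inline bool release_free_heap_memory() {
    return malloc_trim(0) == 1;
}
```

With something like this, the system matrix storage would live in a `MappedBuffer` that is destroyed between reconstructions, and `release_free_heap_memory()` could be called after the first reconstruction to trim whatever is left in the regular heap. On Windows the analogous calls would be VirtualAlloc and VirtualFree.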