Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


parallel processing - OpenMP private array - Segmentation fault: 11

When I try to parallelize my Fortran 90 program with OpenMP, I get a segmentation fault.

    !$OMP PARALLEL DO NUM_THREADS(4) &
    !$OMP PRIVATE(numstrain, i)
    do irep = 1, nrep
        do i=1, 10
            PRINT *, numstrain(i)
        end do
    end do
    !$OMP END PARALLEL DO

If I comment out "PRINT *, numstrain(i)" or remove the OpenMP flags, it runs without error. I suspect a memory access conflict when numstrain(i) is accessed in parallel, but I have already declared i and numstrain as private variables. Could someone please explain why this happens? Thank you so much. :)

UPDATE:

I modified the previous version, and this version prints the correct result.

    integer, allocatable :: numstrain(:)
    integer :: allocate_status
    integer :: n
    n = 1000000
    !$OMP PARALLEL DO NUM_THREADS(4) &
    !$OMP PRIVATE(numstrain, i)
    do irep = 1, nrep
        allocate (numstrain(n), stat = allocate_status)
        do i=1, 10
            PRINT *, numstrain(i)
        end do
        deallocate (numstrain, stat = allocate_status)
    end do
    !$OMP END PARALLEL DO

However, if I move the access to numstrain into another subroutine called by this one (code attached below), two things happen: 1. It always runs in a single thread. 2. At some point (i=4 or 5), it fails with Segmentation fault: 11. The value of i at which the segmentation fault occurs changes when I use a different NUM_THREADS.

    integer, allocatable :: numstrain(:)
    integer :: allocate_status
    integer :: n
    n = 1000000
    !$OMP PARALLEL DO NUM_THREADS(4) &
    !$OMP PRIVATE(numstrain, i)
    do irep = 1, nrep
        allocate (numstrain(n), stat = allocate_status)
        call anotherSubroutine(numstrain)
        deallocate (numstrain, stat = allocate_status)
    end do
    !$OMP END PARALLEL DO

    subroutine anotherSubroutine(numstrain)
        integer, allocatable   :: numstrain(:)
        do i=1, 10
            PRINT *, numstrain(i)
        end do
    end subroutine anotherSubroutine

I also tried allocating/deallocating in both the helper subroutine and the main subroutine, and allocating/deallocating only in the helper subroutine. Nothing changed.


1 Reply


The most typical reason for this is that not enough space is available on the stack to hold the private copy of numstrain. Compute and compare the following two values:

  • the size of the array in bytes
  • the stack size limit

There are two kinds of stack size limits. The stack size of the main thread is controlled by process limits on Unix systems (use ulimit -s to check and modify this limit) or is fixed at link time on Windows (relinking or a binary edit of the executable is necessary in order to change the limit). The stack size of the additional OpenMP threads is controlled by environment variables: the standard OMP_STACKSIZE, or the implementation-specific GOMP_STACKSIZE (GNU/GCC OpenMP) and KMP_STACKSIZE (Intel OpenMP).
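As a rough way to make that comparison, the sketch below computes the size of one private copy and reports the OMP_STACKSIZE setting, if any. It assumes the element count n = 1000000 from the question's update and a default-kind integer array; the program name and the variable names probe, array_bytes and omp_stack are made up for illustration.

```fortran
program stack_budget
   use, intrinsic :: iso_fortran_env, only: int64
   implicit none
   integer, parameter :: n = 1000000   ! element count assumed from the question's update
   integer :: probe                    ! stands in for one element of numstrain
   integer(int64) :: array_bytes
   character(len=64) :: omp_stack
   integer :: env_len, env_status

   ! storage_size returns the size of one element in bits
   array_bytes = int(n, int64) * int(storage_size(probe), int64) / 8_int64
   print '(a,i0,a)', 'one private copy of numstrain needs ', array_bytes, ' bytes'

   ! OMP_STACKSIZE only governs the additional OpenMP threads, not the main thread
   call get_environment_variable('OMP_STACKSIZE', omp_stack, &
                                 length=env_len, status=env_status)
   if (env_status == 0 .and. env_len > 0) then
      print '(2a)', 'OMP_STACKSIZE = ', trim(omp_stack)
   else
      print '(a)', 'OMP_STACKSIZE is not set; the implementation default applies'
   end if
end program stack_budget
```

With the usual 4-byte default integer this comes to roughly 4 MB per thread, which can then be compared against the output of ulimit -s for the main thread and against OMP_STACKSIZE for the other threads.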

Note that most Fortran OpenMP implementations put private arrays on the stack, even if you enable compiler options that allocate large arrays on the heap (tested with GNU's gfortran and Intel's ifort).

If you comment out the PRINT statement, you effectively remove the reference to numstrain and the compiler is free to optimise it out, e.g. it could simply not make a private copy of numstrain, thus the stack limit is not exceeded.


From the additional information that you've provided, one can conclude that stack size is not the culprit. When dealing with private ALLOCATABLE arrays, you should know that:

  • private copies of unallocated arrays remain unallocated;
  • private copies of allocated arrays are allocated with the same bounds.
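These two rules can be checked with a small test program. This is only a sketch: the program name and the array names never_allocated and preallocated are invented for illustration, and the bounds 5:14 are arbitrary.

```fortran
program private_alloc_status
   implicit none
   integer, allocatable :: never_allocated(:)   ! hypothetical: left unallocated
   integer, allocatable :: preallocated(:)      ! hypothetical: allocated before the region

   allocate (preallocated(5:14))

   !$omp parallel num_threads(2) private(never_allocated, preallocated)
   ! The critical section only keeps the output of different threads from interleaving
   !$omp critical
   print *, 'never_allocated allocated? ', allocated(never_allocated)
   print *, 'preallocated allocated?    ', allocated(preallocated), &
            ' bounds: ', lbound(preallocated, 1), ubound(preallocated, 1)
   !$omp end critical
   !$omp end parallel

   deallocate (preallocated)
end program private_alloc_status
```

Built with OpenMP enabled (for example with gfortran's -fopenmp), each thread should report the first array as unallocated and the second as allocated with bounds 5 and 14. The contents of the private copies are undefined until a thread assigns to them.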

If you do not use numstrain outside of the parallel region, it is fine to do what you've done in your first case, but with some modifications:

    integer, allocatable :: numstrain(:)
    integer :: allocate_status
    integer, parameter :: n = 1000000
    ! Explicit interface, required because the dummy argument is ALLOCATABLE
    interface
       subroutine anotherSubroutine(numstrain)
          integer, allocatable :: numstrain(:)
       end subroutine anotherSubroutine
    end interface

    !$OMP PARALLEL NUM_THREADS(4) PRIVATE(numstrain, allocate_status)
    ! Each thread allocates its own private copy once, before the loop
    allocate (numstrain(n), stat = allocate_status)
    !$OMP DO
    do irep = 1, nrep
       call anotherSubroutine(numstrain)
    end do
    !$OMP END DO
    ! Each thread frees its own private copy
    deallocate (numstrain)
    !$OMP END PARALLEL

If you also use numstrain outside of the parallel region, then the allocation and deallocation go outside:

    allocate (numstrain(n), stat = allocate_status)
    ! Private copies are allocated with the same bounds; their contents are undefined
    !$OMP PARALLEL DO NUM_THREADS(4) PRIVATE(numstrain)
    do irep = 1, nrep
       call anotherSubroutine(numstrain)
    end do
    !$OMP END PARALLEL DO
    deallocate (numstrain)

You should also know that when you call a routine that takes an ALLOCATABLE array as an argument, you have to provide an explicit interface for that routine. You can either write an INTERFACE block or put the called routine in a module and then USE that module; both provide the explicit interface. If you do not provide the explicit interface, the compiler will not pass the array correctly and the subroutine will fail to access its contents.
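The module route could look something like the sketch below. The module name strain_tools, the program name, the value nrep = 4 and the zero initialisation are assumptions made so that the example is complete; only anotherSubroutine, numstrain and n come from the question.

```fortran
module strain_tools                        ! hypothetical module name
   implicit none
contains
   subroutine anotherSubroutine(numstrain)
      integer, allocatable, intent(in) :: numstrain(:)
      integer :: i                         ! local, so each call has its own copy
      do i = 1, 10
         print *, numstrain(i)
      end do
   end subroutine anotherSubroutine
end module strain_tools

program explicit_interface_demo
   use strain_tools                        ! makes the explicit interface available
   implicit none
   integer, parameter :: n = 1000000
   integer, parameter :: nrep = 4          ! assumed value, not given in the question
   integer, allocatable :: numstrain(:)
   integer :: irep

   !$omp parallel num_threads(4) private(numstrain)
   ! Without stat=, a failed allocation terminates the program
   allocate (numstrain(n))
   numstrain = 0                           ! assumed: define the values before printing
   !$omp do
   do irep = 1, nrep
      call anotherSubroutine(numstrain)
   end do
   !$omp end do
   deallocate (numstrain)
   !$omp end parallel
end program explicit_interface_demo
```

Because the subroutine lives in a module, the compiler sees its full signature at the call site and no separate INTERFACE block is needed. Declaring i inside the subroutine also avoids relying on implicit typing.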

