Ticket #3745 (closed bug: worksforme)

Opened 3 years ago

Last modified 3 years ago

Non-deterministic behavior with FFI

Reported by: gchrupala
Owned by:
Priority: high
Milestone: 6.12.2
Component: Compiler (FFI)
Version: 6.10.4
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: x86_64 (amd64)
Type of failure: Incorrect result at runtime
Difficulty:
Test Case:
Blocked By:
Blocking:
Related Tickets:

Description

I have a simple perceptron learning algorithm in C++ which I am calling from Haskell via FFI. Most of the time it works perfectly, but occasionally the results it produces are not what I expect. It seems to happen more often if another memory- or CPU-intensive process is running on the machine.

This never happens when I call the same function from C++, which suggests the problem is related to GHC.

I attach a simplified extract of the code that reproduces the problem.

It includes:

* c_perceptronmodel.cpp, c_perceptronmodel.h : C++ implementation

* test.hs : Haskell program which calls the C++ function via FFI

* test.cpp : Equivalent C++ program which calls the same C++ function

* run.sh : shell script which repeatedly runs a program and checks whether any two consecutive runs produce different results

* train : data file which the programs process
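The check that run.sh is described as performing (run the program repeatedly and flag any two consecutive runs whose output differs) can be sketched in C++ as follows. This is not the attached script, whose contents are not shown in this ticket. The helper names `capture_stdout` and `first_divergent_run` are hypothetical, and the sketch assumes a POSIX system where `popen` is available and that the program's result is written to standard output.

```cpp
#include <cassert>
#include <cstdio>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helper: run a shell command and capture its standard output.
// Relies on POSIX popen/pclose, so it is not portable to every platform.
std::string capture_stdout(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (pipe == nullptr) {
        throw std::runtime_error("popen failed for: " + command);
    }
    std::string output;
    char buffer[4096];
    std::size_t n = 0;
    while ((n = std::fread(buffer, 1, sizeof buffer, pipe)) > 0) {
        output.append(buffer, n);
    }
    pclose(pipe);
    return output;
}

// Call `produce` `runs` times. Return the 1-based index of the first run whose
// output differs from the run immediately before it, or 0 if every pair of
// consecutive runs agrees.
int first_divergent_run(const std::function<std::string()>& produce, int runs) {
    assert(runs >= 1 && "at least one run is required");
    std::string previous = produce();
    for (int i = 2; i <= runs; ++i) {
        std::string current = produce();
        if (current != previous) {
            return i;
        }
        previous = current;
    }
    return 0;
}
```

A call such as `first_divergent_run([] { return capture_stdout("./test-ghc-6.10.4"); }, 100)` would then report the first run that disagrees with its predecessor. Taking the producer as a function keeps the comparison logic separate from process handling, so it can be exercised without launching anything.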

I compiled using GHC 6.10.4 like this:

ghc-6.10.4 --make -O2 -o test-ghc-6.10.4 test.hs c_perceptronmodel.cpp -lstdc++

To see the problem execute:

./run.sh ./test-ghc-6.10.4

Attachments

ffi.tar.bz2 (160.8 KB) - added by gchrupala 3 years ago.

Change History

Changed 3 years ago by gchrupala

  Changed 3 years ago by simonmar

  • priority changed from normal to high
  • milestone set to 6.12.2

Thanks for the report, we'll look into it.

follow-up: ↓ 4   Changed 3 years ago by simonmar

I can't reproduce it here on my laptop (x86/Linux). Some questions:

  • can you make it happen on more than one machine? what platform(s)?
  • do you get different results when the program is compiled with -debug?
  • does valgrind say anything?
  • does it happen with 6.12.1?

Is it possible there's a memory error in the FFI code or the C++ code? Perhaps an off-by-one array size?
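To make the memory-error question concrete, here is a minimal sketch of a C-callable perceptron entry point written so that neither an off-by-one size nor uninitialised memory can influence the result. It is not the code from c_perceptronmodel.cpp, which is not shown in this ticket. The function name `train_perceptron_sketch`, its signature, and the row-major data layout are all assumptions made for illustration. The caller passes every array length explicitly, and the weight vector is zero-initialised before use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical entry point; name and signature are assumptions for this
// sketch and are not taken from c_perceptronmodel.h.
//
// features : n_examples * n_features doubles, row-major
// labels   : n_examples ints, each +1 or -1
// weights  : caller-allocated output buffer holding n_features doubles
// Returns the number of weight updates made, or -1 on invalid arguments.
extern "C" int train_perceptron_sketch(const double* features,
                                       const int* labels,
                                       int n_examples,
                                       int n_features,
                                       int epochs,
                                       double* weights) {
    if (features == nullptr || labels == nullptr || weights == nullptr ||
        n_examples < 0 || n_features <= 0 || epochs < 0) {
        return -1;
    }
    const std::size_t rows = static_cast<std::size_t>(n_examples);
    const std::size_t cols = static_cast<std::size_t>(n_features);

    // Zero-initialise so no result depends on leftover heap contents.
    std::vector<double> w(cols, 0.0);
    int updates = 0;

    for (int epoch = 0; epoch < epochs; ++epoch) {
        for (std::size_t i = 0; i < rows; ++i) {
            // Precondition check (active unless NDEBUG is defined).
            assert(labels[i] == 1 || labels[i] == -1);
            const double* x = features + i * cols;  // start of row i
            double score = 0.0;
            for (std::size_t j = 0; j < cols; ++j) {
                score += w[j] * x[j];
            }
            // Perceptron rule: update only when the example is misclassified
            // (a score of exactly zero counts as a mistake).
            if (labels[i] * score <= 0.0) {
                for (std::size_t j = 0; j < cols; ++j) {
                    w[j] += labels[i] * x[j];
                }
                ++updates;
            }
        }
    }
    for (std::size_t j = 0; j < cols; ++j) {
        weights[j] = w[j];
    }
    return updates;
}
```

Because every loop bound comes from the explicit `n_examples` and `n_features` arguments, the Haskell side and the C++ side only have to agree on two integers. A mismatch there is the kind of off-by-one that would be worth ruling out first.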

  Changed 3 years ago by simonmar

  • priority changed from high to normal

Dropping priority while we wait for feedback.

in reply to: ↑ 2   Changed 3 years ago by gchrupala

Hi, thanks for looking into this.

Replying to simonmar:

> I can't reproduce it here on my laptop (x86/Linux). Some questions:
> * can you make it happen on more than one machine? what platform(s)?

I discovered the bug on x86_64/Linux. I have now also tried it on x86/Linux and Intel/Mac, and it doesn't happen on either of those.

> * do you get different results when the program is compiled with -debug?

No, same nondeterministic behavior.

> * does valgrind say anything?

Valgrind reports no errors.

> * does it happen with 6.12.1?

Just tried it, and the same problem is there with 6.12.1.

> Is it possible there's a memory error in the FFI code or the C++ code? Perhaps an off-by-one array size?

I can't be 100% sure there is no subtle bug in the code, but I checked carefully for the obvious ones.

Best, -- Grzegorz

  Changed 3 years ago by simonmar

  • priority changed from normal to high
  • architecture changed from Unknown/Multiple to x86_64 (amd64)

  Changed 3 years ago by simonmar

  • status changed from new to closed
  • resolution set to worksforme

I can't reproduce it on x86-64/Linux either. I suspect a hardware problem on your machine. To rule this out, try to reproduce it on another machine.
