Be not the first by whom the new are tried, nor yet the last to lay the old aside.
Alexander Pope
When I started doing research for my post on JNI, I heard about some newfangled thing called the Foreign Function and Memory API (FFM). Apparently it does all the same things as JNI, but purely in Java code, so you get all the conveniences of modern Java development without the hassles of compiling and linking two different languages and getting them to play nicely together. So after finishing my experiments with JNI, I was excited to give it a try.
Background
The concepts in the FFM have been kicking around for several Java versions, going back at least to Java 17. The API itself was finalized in Java 22, although in Java 24 code that calls its native-access methods still produces warnings at run time unless native access is explicitly enabled with a flag (--enable-native-access=ALL-UNNAMED).
There are several blog posts about using FFM, but they all seem to copy the same examples from the official Java website. Thus I was truly on my own this time.
An aside about AI programming aids
Well, not completely on my own. I made extensive use of AI programming aids during this project, particularly a couple of installations of ChatGPT. I have been slow to get on the AI train, and I am still highly skeptical of many of the claims that are made about it. But I freely admit that I could not have completed this project or the JNI project without its help. There is just so much detailed, obscure, and esoteric knowledge about compiling, linking, tool flags, and platform idiosyncrasies that no person can know it all. While my Google searching skills are decent, I don't believe I could have found the answers I needed within the bounds of my patience in order to bring this to a conclusion. While ChatGPT is not perfect (it is limited by published APIs and documentation and can get confused about the requirements of different software versions), it was definitely a big help to me!
The Arena
The basic idea of FFM is that you take over the management of native memory in Java code instead of native code. This starts with an Arena, which can be opened and disposed of in a try-with-resources block like any other resource. Also within the Java code you can lay out the memory of structs you'll be using.
GroupLayout RASHUNAL_LAYOUT = MemoryLayout.structLayout(
JAVA_INT.withName("numerator"),
JAVA_INT.withName("denominator")
);
GroupLayout GAUSS_FACTORIZATION_LAYOUT = MemoryLayout.structLayout(
ADDRESS.withName("PI"),
ADDRESS.withName("L"),
ADDRESS.withName("D"),
ADDRESS.withName("U")
);
try (Arena arena = Arena.ofConfined()) {
...
}
MemoryLayout is an interface with static methods to lay out primitives, structs, arrays, and other entities. The Arena object is then used to allocate blocks of native memory using a layout as a map.
int[][][] data = ...; // rows x columns; each element is {numerator} or {numerator, denominator}
int height = data.length;
int width = data[0].length; // arrays expose length, not width
int elementCount = height * width;
long elementSize = RASHUNAL_LAYOUT.byteSize();
long elementAlign = RASHUNAL_LAYOUT.byteAlignment();
long totalBytes = elementSize * (long)elementCount;
MemorySegment elems = arena.allocate(totalBytes, elementAlign);
long numOffset = RASHUNAL_LAYOUT.byteOffset(MemoryLayout.PathElement.groupElement("numerator"));
long denOffset = RASHUNAL_LAYOUT.byteOffset(MemoryLayout.PathElement.groupElement("denominator"));
for (int i = 0; i < elementCount; ++i) {
int row = i / width;
int col = i % width;
int[] element = data[row][col];
int numerator = element[0];
int denominator = element.length == 1 ? 1 : element[1];
MemorySegment elementSlice = elems.asSlice(i * elementSize, elementSize);
elementSlice.set(JAVA_INT, numOffset, numerator);
elementSlice.set(JAVA_INT, denOffset, denominator);
}
With JNI, all of this was done in C. Now it's all done in Java code. It's a lot of steps, and it gets down pretty far into the weeds, but there are advantages to doing it all in Java. Pick your poison.
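As a slightly more compact variation on the allocation loop above, here is a minimal sketch that lets the arena size and align the whole array from the layout in one call. It assumes Java 22 or later; the class name RashunalArraySketch and the roundTrip method are made up for illustration and are not part of my project.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.GroupLayout;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import static java.lang.foreign.ValueLayout.JAVA_INT;

// Hypothetical helper class, for illustration only.
final class RashunalArraySketch {
    static final GroupLayout RASHUNAL_LAYOUT = MemoryLayout.structLayout(
            JAVA_INT.withName("numerator"),
            JAVA_INT.withName("denominator"));

    static final long NUM_OFFSET =
            RASHUNAL_LAYOUT.byteOffset(PathElement.groupElement("numerator"));
    static final long DEN_OFFSET =
            RASHUNAL_LAYOUT.byteOffset(PathElement.groupElement("denominator"));

    /** Copies {numerator, denominator} pairs into native memory and reads them back. */
    static int[][] roundTrip(int[][] fractions) {
        long size = RASHUNAL_LAYOUT.byteSize();
        try (Arena arena = Arena.ofConfined()) {
            // One call allocates room for fractions.length structs, correctly aligned.
            MemorySegment elems = arena.allocate(RASHUNAL_LAYOUT, fractions.length);
            for (int i = 0; i < fractions.length; i++) {
                long base = i * size; // start of the i-th struct
                elems.set(JAVA_INT, base + NUM_OFFSET, fractions[i][0]);
                elems.set(JAVA_INT, base + DEN_OFFSET, fractions[i][1]);
            }
            int[][] copy = new int[fractions.length][2];
            for (int i = 0; i < fractions.length; i++) {
                long base = i * size;
                copy[i][0] = elems.get(JAVA_INT, base + NUM_OFFSET);
                copy[i][1] = elems.get(JAVA_INT, base + DEN_OFFSET);
            }
            return copy;
        } // the arena frees the native block here
    }
}
```

Computing the field position as base plus offset avoids creating a slice per element, at the cost of doing the arithmetic by hand.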
Native methods are brought into the Java code as method handles, obtained by requesting a downcall handle from a Linker object (a downcall goes from Java to native code; an upcall goes from native code back to Java). To create the downcall handle you need the full signature of the native method, with the return type first.
Linker linker = Linker.nativeLinker();
SymbolLookup lookup = OpenNativeLib(arena); // I'll come back to this later
MemorySegment newRMatrixLocation = lookup.find("new_RMatrix").orElseThrow(); // find returns an Optional
MethodHandle new_RMatrix_handle = linker.downcallHandle(newRMatrixLocation, FunctionDescriptor.of(ADDRESS, JAVA_LONG, JAVA_LONG, ADDRESS));
After getting a Linker object, the native library needs to be opened and brought into the JVM. OpenNativeLib is a static method I wrote on the utility class this code is coming from, and I'll come back to its details later.
linker.downcallHandle accepts a MemorySegment, a FunctionDescriptor, and a variable-length list of Linker.Options. It returns a MethodHandle that can be used to call into native methods.
The SymbolLookup returned by OpenNativeLib is used to search the native library for methods and constants. It's a simple name lookup, and returns an Optional with whatever it finds.
The FunctionDescriptor is fairly self-explanatory: it's the signature of a native method, built from constants in java.lang.foreign.ValueLayout that represent the return value and the arguments (return value first, followed by arguments). ADDRESS is the general-purpose layout for a C pointer. new_RMatrix accepts two longs for the height and width of the matrix to be constructed and a pointer to an array of Rashunals, and returns a pointer to the newly allocated RMatrix.
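To see the same lookup, descriptor, and downcall steps without any custom library, here is a minimal sketch that calls strlen from the C standard library. It assumes Java 22 or later and a 64-bit platform where size_t fits in JAVA_LONG; the StrlenSketch class is hypothetical and only exists for this illustration.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.invoke.MethodHandle;
import static java.lang.foreign.ValueLayout.ADDRESS;
import static java.lang.foreign.ValueLayout.JAVA_LONG;

// Hypothetical demo class, not part of the RMatrix code.
final class StrlenSketch {
    static long strlen(String text) {
        Linker linker = Linker.nativeLinker();
        // The default lookup exposes commonly used C library symbols.
        MemorySegment strlenAddress = linker.defaultLookup().find("strlen").orElseThrow();
        // C signature: size_t strlen(const char *s)  ->  return type first, then arguments.
        MethodHandle strlenHandle = linker.downcallHandle(
                strlenAddress, FunctionDescriptor.of(JAVA_LONG, ADDRESS));
        try (Arena arena = Arena.ofConfined()) {
            // Copies the Java string into native memory as a NUL-terminated C string.
            MemorySegment cString = arena.allocateFrom(text);
            return (long) strlenHandle.invoke(cString);
        } catch (Throwable t) {
            // MethodHandle.invoke is declared to throw Throwable.
            throw new IllegalStateException("strlen downcall failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(strlen("hello"));
    }
}
```

The cast on the result matters: invoke is signature-polymorphic, so the cast tells the compiler which return type the call site expects.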
Once the handle for new_RMatrix is in hand, it can be called to allocate a new RMatrix:
new_RMatrix_handle.invoke((long) height, (long) width, elems);
// compiles, but blows up when run
Not so fast! elems represents an array of Rashunal structs laid out in sequence in native memory. But what new_RMatrix expects is a pointer to an array of Rashunal pointers, not the Rashunals themselves. So that array of pointers also needs to be constructed:
MemorySegment ptrArray = arena.allocate(ADDRESS.byteSize() * elementCount, ADDRESS.byteAlignment());
for (int i = 0; i < elementCount; ++i) {
MemorySegment elementAddr = elems.asSlice(i * elementSize, elementSize);
ptrArray.setAtIndex(ADDRESS, i, elementAddr);
}
MemorySegment nativeRMatrix = (MemorySegment) new_RMatrix_handle.invoke((long) height, (long) width, ptrArray); // cast required: invoke is signature-polymorphic
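To convince myself of what actually ends up in the pointer array, a small check like the following can help. It is only a sketch assuming Java 22 or later, and the PointerArraySketch class and pointersMatch method are invented for this illustration: it stores the address of each struct in a pointer-sized slot and then verifies that reading the slot back yields that address.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.GroupLayout;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import static java.lang.foreign.ValueLayout.ADDRESS;
import static java.lang.foreign.ValueLayout.JAVA_INT;

// Hypothetical demo class, for illustration only.
final class PointerArraySketch {
    static final GroupLayout RASHUNAL_LAYOUT = MemoryLayout.structLayout(
            JAVA_INT.withName("numerator"),
            JAVA_INT.withName("denominator"));

    /** Returns true if every slot in the pointer array holds the address of its struct. */
    static boolean pointersMatch(int count) {
        long size = RASHUNAL_LAYOUT.byteSize();
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment elems = arena.allocate(RASHUNAL_LAYOUT, count); // the structs
            MemorySegment ptrArray = arena.allocate(ADDRESS, count);      // one pointer per struct
            for (int i = 0; i < count; i++) {
                // Storing a segment under ADDRESS writes its native address into the slot.
                ptrArray.setAtIndex(ADDRESS, i, elems.asSlice(i * size, size));
            }
            for (int i = 0; i < count; i++) {
                MemorySegment stored = ptrArray.getAtIndex(ADDRESS, i);
                // A pointer read back this way is a zero-length segment: address only, no size.
                if (stored.address() != elems.address() + i * size || stored.byteSize() != 0) {
                    return false;
                }
            }
            return true;
        }
    }
}
```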
In a similar way, I got handles to RMatrix_gelim to factor the input matrix and to RMatrix_height, RMatrix_width, and RMatrix_get to get information about the four matrices in the factorization. There was one wrinkle when getting information about structs returned by pointer from these methods:
MemorySegment factorZero = (MemorySegment) RMatrix_gelim_handle.invoke(rmatrixPtr);
MemorySegment factor = factorZero.reinterpret(GAUSS_FACTORIZATION_LAYOUT.byteSize(), arena, null);
long piOffset = GAUSS_FACTORIZATION_LAYOUT.byteOffset(MemoryLayout.PathElement.groupElement("PI"));
...
MemorySegment piPtr = factor.get(ADDRESS, piOffset);
...
When a native method returns a pointer to a struct, the handle returns a zero-length memory segment that carries the address but no information about the struct it points to. It needs to be reinterpreted (with reinterpret) as a segment of the struct's size, taken from the MemoryLayout that corresponds to the struct. Then the fields can be read using offsets, the reverse of the process used to set data.
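The zero-length behavior is easy to reproduce without RMatrix. The sketch below, which assumes Java 22 or later and uses malloc and free from the C standard library through the default lookup, shows the size before and after reinterpret. The ReinterpretSketch class is hypothetical, and it uses the one-argument form of reinterpret rather than the three-argument form in my code above.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SymbolLookup;
import java.lang.invoke.MethodHandle;
import static java.lang.foreign.ValueLayout.ADDRESS;
import static java.lang.foreign.ValueLayout.JAVA_INT;
import static java.lang.foreign.ValueLayout.JAVA_LONG;

// Hypothetical demo class, for illustration only.
final class ReinterpretSketch {
    /** Returns {size as returned by malloc's handle, size after reinterpret, value read back}. */
    static long[] mallocRoundTrip(int value) {
        Linker linker = Linker.nativeLinker();
        SymbolLookup libc = linker.defaultLookup();
        MethodHandle malloc = linker.downcallHandle(
                libc.find("malloc").orElseThrow(), FunctionDescriptor.of(ADDRESS, JAVA_LONG));
        MethodHandle free = linker.downcallHandle(
                libc.find("free").orElseThrow(), FunctionDescriptor.ofVoid(ADDRESS));
        try {
            // The returned segment knows its address but has byteSize() == 0.
            MemorySegment raw = (MemorySegment) malloc.invoke(JAVA_INT.byteSize());
            if (raw.address() == 0L) {
                throw new OutOfMemoryError("malloc returned NULL");
            }
            try {
                // Tell the JVM how many bytes are really behind the pointer.
                MemorySegment sized = raw.reinterpret(JAVA_INT.byteSize());
                sized.set(JAVA_INT, 0, value);
                return new long[] {raw.byteSize(), sized.byteSize(), sized.get(JAVA_INT, 0)};
            } finally {
                free.invoke(raw); // native allocations are not owned by any arena
            }
        } catch (Throwable t) {
            throw new IllegalStateException("native call failed", t);
        }
    }
}
```

Trying to read or write through the raw segment before reinterpreting it fails with a bounds error, which is the JVM's way of refusing to guess how big the native allocation is.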
Then I worked on the code to translate the native matrices back into Java objects:
long numeratorOffset = RASHUNAL_LAYOUT.byteOffset(MemoryLayout.PathElement.groupElement("numerator"));
long denominatorOffset = RASHUNAL_LAYOUT.byteOffset(MemoryLayout.PathElement.groupElement("denominator"));
long height = (long) RMatrix_height_handle.invoke(mPtr);
long width = (long) RMatrix_width_handle.invoke(mPtr);
JRashunal[] data = new JRashunal[Math.toIntExact(height * width)];
for (long i = 1; i <= height; ++i) {
for (long j = 1; j <= width; ++j) {
MemorySegment elementZero = (MemorySegment) RMatrix_get_handle.invoke(mPtr, i, j);
MemorySegment element = elementZero.reinterpret(RASHUNAL_LAYOUT.byteSize(), arena, null);
int numerator = element.get(JAVA_INT, numeratorOffset);
int denominator = element.get(JAVA_INT, denominatorOffset);
data[Math.toIntExact((i - 1) * width + (j - 1))] = new JRashunal(numerator, denominator);
}
}
JRashunalMatrix jrm = new JRashunalMatrix(Math.toIntExact(height), Math.toIntExact(width), data);
In this way I was able to complete a round trip from Java objects to native code and back.
Missing Link
So how do you load the native code? I thought it would be as simple as the guides say.
var lookup = SymbolLookup.libraryLookup("rmatrix", arena);
Unfortunately, that's not the way it turned out. Many ChatGPT questions and answers followed, but apparently there is a big difference between SymbolLookup.libraryLookup and
System.loadLibrary("jnirmatrix");
which is how I loaded the native library compiled from the JNI header. That used C tools to find rmatrix and rashunal, which are well-understood and have stood the test of time.
According to ChatGPT, System.loadLibrary does a lot of additional work on behalf of the programmer, including formatting library names correctly, looking for code in platform-specific locations, and handling symlinks. FFM deliberately dials back on that: SymbolLookup.libraryLookup is a pretty thin layer that does none of this additional work on the programmer's behalf. Adding entries to LD_LIBRARY_PATH, DYLD_LIBRARY_PATH, java.library.path, or anywhere else was completely useless.
The only workaround I could find is to define an environment variable with the complete, absolute path to the library, retrieve that in the Java code, and pass that into SymbolLookup.libraryLookup:
$ export RMATRIX_LIB=/usr/local/lib/librmatrix.so
$ java ...
String libString = System.getenv("RMATRIX_LIB");
if (libString == null || libString.isBlank()) {
throw new IllegalStateException("Environment variable RMATRIX_LIB needed to load native libraries");
}
Path libPath = Path.of(libString);
if (!Files.isRegularFile(libPath)) {
throw new IllegalArgumentException("Library file not found: " + libPath);
}
System.out.println("Loading native library: " + libPath.toAbsolutePath());
return SymbolLookup.libraryLookup(libPath, arena);
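A possible refinement, which I have not tested against RMatrix, is to rebuild a little of what System.loadLibrary does by hand: turn a short name into the platform's file name with System.mapLibraryName and search a list of directories for it. The sketch below is only that, a sketch; the LibraryLocatorSketch class is hypothetical, and searching java.library.path is my own assumption about where to look, not something FFM does for you.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, for illustration only.
final class LibraryLocatorSketch {
    /** Looks for the platform-specific file name of libName in each directory, in order. */
    static Optional<Path> find(String libName, List<Path> directories) {
        // For "rmatrix" this gives librmatrix.so on Linux, rmatrix.dll on Windows,
        // and librmatrix.dylib on macOS.
        String fileName = System.mapLibraryName(libName);
        return directories.stream()
                .map(dir -> dir.resolve(fileName))
                .filter(Files::isRegularFile)
                .map(Path::toAbsolutePath)
                .findFirst();
    }

    /** Splits java.library.path into directories (an assumed search location). */
    static List<Path> javaLibraryPath() {
        String raw = System.getProperty("java.library.path", "");
        return Arrays.stream(raw.split(File.pathSeparator))
                .filter(entry -> !entry.isBlank())
                .map(Path::of)
                .toList();
    }
}
```

The resulting absolute path could then be handed to SymbolLookup.libraryLookup(path, arena) in place of the environment variable. It would not help with libraries that the loaded library itself depends on, since those are resolved by the operating system's loader rather than by Java.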
If you are horrified by this level of specificity and handholding, I'm right there with you. It gets worse, though, because on Windows that's not enough. RMatrix refers to Rashunal, but Windows cannot resolve the Rashunal code even if it was compiled and installed the same way as RMatrix. The Rashunal library needs to be on PATH in order to be findable by the system:
> $env:RMATRIX_LIB = "C:/Users/john.todd/local/rmatrix/bin/rmatrix.dll"
> $env:PATH += ";C:/Users/john.todd/local/rashunal/bin"
> java ...
Are you aghast? It gets worse, because there is no way to do it at all on a Mac. Modern versions of macOS (since OS X El Capitan) have something called System Integrity Protection (SIP), which the developers in Cupertino have wisely put into place to protect us all from ourselves. The Google AI answer for "what is sip macos" says it "Prevents unauthorized code execution: SIP prevents malicious software from running unauthorized code on your Mac", which I guess includes loading dependent libraries from the JVM.
I could load RMatrix using the environment variable trick, but I couldn't load Rashunal from it because RMatrix uses rpaths (run-time search paths) to refer to libraries it depends on. In other situations (like the JNI application) the search locations can be supplied through DYLD_LIBRARY_PATH or DYLD_FALLBACK_LIBRARY_PATH, but SIP restricts that from working in certain contexts, such as the JVM. I refused to recompile or rewrite the libraries once I had them minimally working on all three platforms because I saw this experiment as a way to call into existing legacy code that couldn't be altered. ChatGPT suggested using install_name_tool to rewrite the libraries' rpaths as full absolute paths, but (maybe stubbornly at this point) I didn't think I should need to mess around with them to make them loadable.
I saw one link that said something about adding the com.apple.security.cs.allow-dyld-environment-variables entitlement, but that approach sounded even more hideous than anything I've described so far. So I had to abandon macOS as an environment I can do this on.
Cleaning up
Like Java's good old garbage collector, the Arena will clean up any memory directly allocated in it, like the Rashunal array or the pointer array in the code segments above. But memory that is allocated in the native code is opaque to the Java code, and will leak if it's not cleaned up. To do that, you need handles to any library-specific cleanup code or to the stdlib free function. Note the special-purpose FunctionDescriptor.ofVoid method to describe native methods that return void:
MemorySegment freeRMatrixLocation = lookup.find("free_RMatrix").orElseThrow(); // look up the cleanup function, not the constructor
MethodHandle freeRMatrixHandle = linker.downcallHandle(freeRMatrixLocation, FunctionDescriptor.ofVoid(ADDRESS));
var clib = linker.defaultLookup();
MemorySegment freeLocation = clib.find("free").orElseThrow();
MethodHandle freeHandle = linker.downcallHandle(freeLocation, FunctionDescriptor.ofVoid(ADDRESS));
freeRMatrixHandle.invoke(rmatrixPtr);
freeHandle.invoke(rashunalElement);
I briefly looked at using Valgrind to verify that I wasn't leaking anything further. Apparently the JVM itself spawns a lot of false (?) alarms. I grepped the output for any mentions of librmatrix or librashunal and didn't find any, so hopefully this approach doesn't leak too badly.
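One way to make forgotten frees less likely, which I did not use in this project, is the cleanup argument of reinterpret: instead of passing null, pass an action that frees the native block, and the arena will run it when it closes. The following is a minimal sketch assuming Java 22 or later, with malloc and free standing in for library-specific allocation and cleanup functions; the ArenaCleanupSketch class and its counter are hypothetical and exist only to show that the action runs.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SymbolLookup;
import java.lang.invoke.MethodHandle;
import java.util.concurrent.atomic.AtomicInteger;
import static java.lang.foreign.ValueLayout.ADDRESS;
import static java.lang.foreign.ValueLayout.JAVA_INT;
import static java.lang.foreign.ValueLayout.JAVA_LONG;

// Hypothetical demo class, for illustration only.
final class ArenaCleanupSketch {
    static final AtomicInteger FREED = new AtomicInteger();

    /** Allocates with malloc, ties the block to an arena, and returns how many frees ran. */
    static int allocateAndRelease() {
        Linker linker = Linker.nativeLinker();
        SymbolLookup libc = linker.defaultLookup();
        MethodHandle malloc = linker.downcallHandle(
                libc.find("malloc").orElseThrow(), FunctionDescriptor.of(ADDRESS, JAVA_LONG));
        MethodHandle free = linker.downcallHandle(
                libc.find("free").orElseThrow(), FunctionDescriptor.ofVoid(ADDRESS));
        int before = FREED.get();
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment raw = (MemorySegment) malloc.invoke(64L);
            if (raw.address() == 0L) {
                throw new OutOfMemoryError("malloc returned NULL");
            }
            // The cleanup action runs when the arena closes, with a segment at the same address.
            MemorySegment owned = raw.reinterpret(64L, arena, segment -> {
                try {
                    free.invoke(segment);
                    FREED.incrementAndGet();
                } catch (Throwable t) {
                    throw new IllegalStateException("free failed", t);
                }
            });
            owned.set(JAVA_INT, 0, 42); // usable like any arena-allocated segment
        } catch (Throwable t) {
            throw new IllegalStateException("native call failed", t);
        }
        return FREED.get() - before;
    }
}
```

With this arrangement the native block follows the same try-with-resources lifetime as memory allocated directly in the arena, so the explicit free call at the end can be dropped.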
Reflection
So Linux and Windows are the only operating systems left standing, although with some ugly environment variable hacks. And Windows is worse than Linux since you have to put all the libraries on the path in order for the native code to run. I am very disappointed in macOS; if there are any more experienced cross-language Mac programmers out there, I'd love to be proven wrong about how to handle this.
I'm also disappointed in the FFM; despite the hype, it seems to me that it's not ready for full production use yet. If you're considering using FFM to call native code, don't yet. If you have to, do it in Linux. If you have to do it in Windows, try to put it in a Linux container and call it from your host.