Monday, September 29, 2025

Going Native - C#

"I belong to the warrior in whom the old ways have joined the new."

Inscription on the sword wielded by Captain Nathan Algren, The Last Samurai

From the JVM to the CLR

This is the third part in a series on calling native code from high-level languages. I've been interested in making useful code locked away in native libraries more widely available, and took this opportunity to finally look into how it's done.

Here is a description of the native library I'm calling in this series.

After struggling through getting the FFM to work, I wasn't sure what to expect from .NET. Nevertheless, C# is the language I'm next most familiar with, so I went ahead and plunged in.


The approach I followed is explicit P/Invoke, outlined on the Microsoft Learn website, which provides good background and an outline of the process and its alternatives. In practice it was so easy that I got by just with conversations with ChatGPT.

The Basics

I started by declaring structs that mirrored the (public) structs in the native libraries:


[StructLayout(LayoutKind.Sequential)]
private struct Rashunal
{
    public int numerator;
    public int denominator;
}
[StructLayout(LayoutKind.Sequential)]
private struct GaussFactorization
{
    public IntPtr PInverse;
    public IntPtr Lower;
    public IntPtr Diagonal;
    public IntPtr Upper;
}

The attributes indicate that the structs are laid out in memory with each field directly following the previous one. IntPtr is a .NET value type that represents a pointer to some memory location. You'll see it again!

Then the native functions are declared with signatures that use C#'s types, with attributes that say which library to find each one in and what the native entry point is. The methods (and the class) are declared partial because the implementation is generated to forward to the native code. By convention the C# function and the native function have the same name, but that's not required.


[LibraryImport("rashunal", EntryPoint = "n_Rashunal")]
private static partial IntPtr n_Rashunal(int numerator, int denominator);

[LibraryImport("rmatrix", EntryPoint = "new_RMatrix")]
private static partial IntPtr new_RMatrix(int height, int width, IntPtr data);

[LibraryImport("rmatrix", EntryPoint = "RMatrix_height")]
private static partial int RMatrix_height(IntPtr m);

[LibraryImport("rmatrix", EntryPoint = "RMatrix_width")]
private static partial int RMatrix_width(IntPtr m);

[LibraryImport("rmatrix", EntryPoint = "RMatrix_get")]
private static partial IntPtr RMatrix_get(IntPtr m, int row, int col);

[LibraryImport("rmatrix", EntryPoint = "RMatrix_gelim")]
private static partial IntPtr RMatrix_gelim(IntPtr m);

Then the native methods can be called alongside normal C# code. I'll go in reverse of the actual process of factoring a matrix using the native code.


public static CsGaussFactorization Factor(Model.CsRMatrix m)
{
    var nativeMPtr = AllocateNativeRMatrix(m);
    var fPtr = RMatrix_gelim(nativeMPtr);
    var f = Marshal.PtrToStructure<GaussFactorization>(fPtr);
    var csF = new CsGaussFactorization
    {
        PInverse = AllocateManagedRMatrix(f.PInverse),
        Lower = AllocateManagedRMatrix(f.Lower),
        Diagonal = AllocateManagedRMatrix(f.Diagonal),
        Upper = AllocateManagedRMatrix(f.Upper),
    };
    NativeStdLib.Free(nativeMPtr);
    NativeStdLib.Free(fPtr);
    return csF;
}

First I call a method to allocate a native matrix (below), then I call RMatrix_gelim on it, which returns a pointer to a native struct. Since the struct is part of the public native interface, it can be unmarshaled into a C# object with the Marshal.PtrToStructure call. The native matrix pointers inside it are then used to construct managed matrices through the AllocateManagedRMatrix calls (also below). Finally, since the native matrix and the factorization were allocated by the native code, their pointers have to be freed by a call to the native free method (also below).


private static IntPtr AllocRashunal(int num, int den)
{
    IntPtr ptr = NativeStdLib.Malloc((UIntPtr)Marshal.SizeOf<Rashunal>());
    var value = new Rashunal { numerator = num, denominator = den };
    Marshal.StructureToPtr(value, ptr, false);
    return ptr;
}

private static IntPtr AllocateNativeRMatrix(Model.CsRMatrix m)
{
    int elementCount = m.Height * m.Width;
    IntPtr elementArray = NativeStdLib.Malloc((UIntPtr)(IntPtr.Size * elementCount));
    unsafe
    {
        var pArray = (IntPtr*)elementArray;
        for (int i = 0; i < elementCount; ++i)
        {
            var element = m.Data[i];
            var elementPtr = AllocRashunal(element.Numerator, element.Denominator);
            pArray[i] = elementPtr;
        }
        var rMatrixPtr = new_RMatrix(m.Height, m.Width, elementArray);
        for (int i = 0; i < elementCount; ++i)
        {
            NativeStdLib.Free(pArray[i]);
        }
        NativeStdLib.Free(elementArray);
        return rMatrixPtr;
    }
}

Allocating a native RMatrix required native memory allocations, both for the individual Rashunals and for the array of Rashunal pointers. In a pattern that seems familiar now, I wrapped those calls in a NativeStdLib class that I promise to get to very soon. Allocating a Rashunal involves declaring a managed Rashunal struct, allocating native memory for it, and marshaling the struct into that memory. The unsafe block is needed to treat the block of memory allocated for the pointer array as an actual array instead of a block of unstructured memory; to get this to compile I had to add <AllowUnsafeBlocks>true</AllowUnsafeBlocks> to the PropertyGroup in the project file. Finally, I have to free both the individual allocated native Rashunals and the array of pointers to them, since new_RMatrix makes copies of them all.


private static Model.CsRMatrix AllocateManagedRMatrix(IntPtr m)
{
    int height = RMatrix_height(m);
    int width = RMatrix_width(m);
    var data = new CsRashunal[height * width];
    for (int i = 1; i <= height; ++i)
    {
        for (int j = 1; j <= width; ++j)
        {
            var rPtr = RMatrix_get(m, i, j);
            var r = Marshal.PtrToStructure<Rashunal>(rPtr);
            data[(i - 1) * width + (j - 1)] = new CsRashunal { Numerator = r.numerator, Denominator = r.denominator };
            NativeStdLib.Free(rPtr);
        }
    }
    return new Model.CsRMatrix { Height = height, Width = width, Data = data, };
}

After all that, allocating a managed RMatrix is not very interesting. The native RMatrix_get method returns a newly-allocated copy of the Rashunal at a position in the RMatrix, so it has to be freed the same way as before.

Ok, finally, as promised, here is the interface to loading the native standard library methods:


using System.Reflection;
using System.Runtime.InteropServices;

namespace CsRMatrix.Engine;

public static partial class NativeStdLib
{
    static NativeStdLib()
    {
        NativeLibrary.SetDllImportResolver(typeof(NativeStdLib).Assembly, ResolveLib);
    }

    private static IntPtr ResolveLib(string libraryName, Assembly assembly, DllImportSearchPath? searchPath)
    {
        if (libraryName == "c")
        {
            if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
                return NativeLibrary.Load("ucrtbase.dll", assembly, searchPath);
            if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
                return NativeLibrary.Load("libc.so.6", assembly, searchPath);
            if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX))
                return NativeLibrary.Load("libSystem.dylib", assembly, searchPath);
        }
        return IntPtr.Zero;
    }

    [LibraryImport("c", EntryPoint = "free")]
    internal static partial void Free(IntPtr ptr);

    [LibraryImport("c", EntryPoint = "malloc")]
    internal static partial IntPtr Malloc(UIntPtr size);
}

The platform-specific switching and filenames are pretty ugly, but neither ChatGPT nor I could find a way around it. At least it's confined to a single method in a single class in the project.

ChatGPT really wanted there to be library-specific ways to free Rashunals and factorizations. Then those methods could be declared and called the same way as the new_* methods. But I remained stubborn and said I didn't want to change the source code of the libraries. I was willing to recompile them as needed, but not to change the source code or the CMake files. Eventually, we found this way of handling the standard native library calls.

Getting the name of the file on Windows and getting this to compile and work was a little challenging. The C# code and the native code need to match exactly in the operating system (obviously), architecture (64-bit vs. 32-bit), and configuration (Debug vs. Release). It took a few more details than what I went through when compiling the JNI code.

Compiling on Windows

Windows is very careful about which runtime frees memory: memory can only be freed by the same C runtime that allocated it. Practically, that meant I needed to make sure I was allocating and freeing memory from the same runtime with the same C runtime model. That meant I needed to compile with the multi-threaded DLL (/MD) compiler flag instead of the default multi-threaded (/MT) one. I also needed to use the right filename to link the libraries to; ChatGPT and I thought it was msvcrt initially. So I modified the steps to compile the library and checked its headers, imports, and dependencies. This again is in an x64 Native Tools Command Prompt for VS 2022.


>cmake .. -G "NMake Makefiles" ^
  -DCMAKE_BUILD_TYPE=Release ^
  -DCMAKE_INSTALL_PREFIX=C:/Users/john.todd/local/rashunal ^
  -DCMAKE_C_FLAGS_RELEASE="/MD /O2 /DNDEBUG"
>nmake
>nmake install
>cd /Users/john.todd/local/rashunal/bin
>dumpbin /headers rashunal.dll | findstr machine
            8664 machine (x64)

>dumpbin /imports rashunal.dll | findstr free
                          18 free

>dumpbin /dependents rashunal.dll

I didn't see msvcrt.dll, but did see VCRUNTIME140.DLL instead. ChatGPT said, "Ah, that's okay, that's actually better. msvcrt is the old way, ucrt (Universal CRT) is the new way." Then linking to "ucrtbase" in the NativeStdLib utility class (as shown above) worked.

Like with JNI, I had to add the directories containing the Rashunal and RMatrix libraries to the PATH, and then it worked!


> $env:PATH += ";C:\Users\john.todd\local\rashunal\bin;C:\Users\john.todd\local\rmatrix\bin"
> dotnet run C:\Users\john.todd\source\repos\rmatrix\driver\example.txt
Using launch settings from C:\Users\john.todd\source\repos\GoingNative\CsRMatrix\CsRMatrix\Properties\launchSettings.json...
Reading matrix from C:/Users/john.todd/source/repos/rmatrix/driver/example.txt
Starting Matrix:
[ {-2/1} {1/3} {-3/4} ]
[ {6/1} {-1/1} {8/1} ]
[ {8/1} {3/2} {-7/1} ]


PInverse:
[ {1/1} {0/1} {0/1} ]
[ {0/1} {0/1} {1/1} ]
[ {0/1} {1/1} {0/1} ]


Lower:
[ {1/1} {0/1} {0/1} ]
[ {-3/1} {1/1} {0/1} ]
[ {-4/1} {0/1} {1/1} ]


Diagonal:
[ {-2/1} {0/1} {0/1} ]
[ {0/1} {17/6} {0/1} ]
[ {0/1} {0/1} {23/4} ]


Upper:
[ {1/1} {-1/6} {3/8} ]
[ {0/1} {1/1} {-60/17} ]
[ {0/1} {0/1} {1/1} ]

What's even more exciting is that when I committed this to GitHub and pulled it down on Linux and MacOS, it also just worked (for MacOS after adding the install directories to DYLD_LIBRARY_PATH, similarly to what I had to do with JNI).

Optimization

Remembering to free pointers allocated by native code isn't so bad. I had to do it in Java with the FFM and when writing the libraries in the first place. But ChatGPT suggested an optimization to have the CLR do it automatically. After I reassured it many times that the new_*, RMatrix_get, and RMatrix_gelim native methods return pointers to newly-allocated copies of the relevant entities and not pointers to the entities themselves, it said this was the perfect application of the handle pattern. Who can pass that up?

First I wrote some wrapper classes for the pointers returned from the native code:


internal abstract class NativeHandle : SafeHandle
{
    protected NativeHandle() : base(IntPtr.Zero, ownsHandle: true) { }

    protected NativeHandle(IntPtr existing, bool ownsHandle)
        : base(IntPtr.Zero, ownsHandle)
        => SetHandle(existing);

    public override bool IsInvalid => handle == IntPtr.Zero;

    protected override bool ReleaseHandle()
    {
        NativeStdLib.Free(handle);
        return true;
    }
}

internal sealed class RashunalHandle : NativeHandle
{
    internal RashunalHandle() : base() { }

    internal RashunalHandle(IntPtr existing, bool ownsHandle)
        : base(existing, ownsHandle) { }
}

internal sealed class RMatrixHandle : NativeHandle
{
    internal RMatrixHandle() : base() { }

    internal RMatrixHandle(IntPtr existing, bool ownsHandle)
        : base(existing, ownsHandle) { }
}

internal sealed class GaussFactorizationHandle : NativeHandle
{
    internal GaussFactorizationHandle() : base() { }

    internal GaussFactorizationHandle(IntPtr existing, bool ownsHandle)
        : base(existing, ownsHandle) { }
}

Then I had most of the native and managed code use the handles as parameters and return values instead of the pointers returned by the native code:


[DllImport("rashunal", EntryPoint = "n_Rashunal")]
private static extern RashunalHandle n_Rashunal(int numerator, int denominator);

[DllImport("rmatrix", EntryPoint = "new_RMatrix")]
private static extern RMatrixHandle new_RMatrix(int height, int width, IntPtr data);

[DllImport("rmatrix", EntryPoint = "RMatrix_height")]
private static extern int RMatrix_height(RMatrixHandle m);

[DllImport("rmatrix", EntryPoint = "RMatrix_width")]
private static extern int RMatrix_width(RMatrixHandle m);

[DllImport("rmatrix", EntryPoint = "RMatrix_get")]
private static extern RashunalHandle RMatrix_get(RMatrixHandle m, int row, int col);

[DllImport("rmatrix", EntryPoint = "RMatrix_gelim")]
private static extern GaussFactorizationHandle RMatrix_gelim(RMatrixHandle m);

private static Model.CsRMatrix AllocateManagedRMatrix(RMatrixHandle m)
{
    int height = RMatrix_height(m);
    int width = RMatrix_width(m);
    var data = new CsRashunal[height * width];
    for (int i = 1; i <= height; ++i)
    {
        for (int j = 1; j <= width; ++j)
        {
            using var rPtr = RMatrix_get(m, i, j);
            var r = Marshal.PtrToStructure<Rashunal>(rPtr.DangerousGetHandle());
            data[(i - 1) * width + (j - 1)] = new CsRashunal { Numerator = r.numerator, Denominator = r.denominator };
        }
    }
    return new Model.CsRMatrix { Height = height, Width = width, Data = data, };
}

Note the switch from LibraryImport to DllImport on the native method declarations. LibraryImport is newer and preferred, but for some reason it can't do the automatic marshaling of returned pointers into handles like DllImport can.

Now there's no need to explicitly free the pointers returned from RMatrix_get, n_Rashunal, new_RMatrix, and RMatrix_gelim. There are still some places where I have to remember to free memory, such as the array of Rashunal pointers allocated in AllocateNativeRMatrix. There are also some calls to ptr.DangerousGetHandle() when I need to marshal a pointer to a struct. I tried to get rid of those, but apparently they are unavoidable.

I didn't like the repeated boilerplate code in the concrete subclasses of NativeHandle. I wanted to just use NativeHandle as a generic, i.e. NativeHandle<T>, but that didn't work. ChatGPT said I needed a concrete class to marshal the native struct into, and that the structs I declared in the adapter wouldn't do it. That's also why the parameterless constructors are needed (for the marshaling code), even though they don't do anything but defer to the base class. So be it.

Reflection

After struggling so much with FFM, I was pleasantly surprised by how easy it was to work with C# and its method of calling native code. Interspersing the native calls with the managed code was pretty fun and easy, especially after refactoring to use handles to automatically dispose of allocated memory. It was a little tricky figuring out when I still had to marshal pointers into structs or vice versa, but the compiler and ChatGPT helped me figure it out pretty quickly.

So far, if given the choice of how to call my native libraries, C# and the CLR is definitely how I would do it.

Code repository

https://github.com/proftodd/GoingNative/tree/main/CsRMatrix

Wednesday, September 17, 2025

Going Native - Foreign Function & Memory API (FFM)

Be not the first by whom the new are tried, nor yet the last to lay the old aside.

Alexander Pope

When I started doing research for my post on JNI, I heard about some newfangled thing called the Foreign Function and Memory API (FFM). Apparently it does all the same things as JNI, but purely in Java code, so you have all the conveniences of modern Java development without all the hassles of compiling and linking two different languages and getting them to play nicely together. After finishing my experiments with JNI, therefore, I was excited to give it a try.

For a refresher on the native matrix library, see the section The native code in the introduction to this series.

The concepts behind FFM have been kicking around for several Java versions, going back at least to Java 17. It's nearly finalized in Java 24, although native-accessing code still triggers warnings when run without specific flags (--enable-native-access=ALL-UNNAMED).

There are several blog posts about using FFM, but they all seem to copy the same examples on the official Java website. Thus I was truly on my own this time.

An aside about AI programming aids

Well, not completely on my own. I made extensive use of AI programming aids during this project, particularly a couple of installations of ChatGPT. I have been slow to get on the AI train, and I am still highly skeptical of many of the claims made about it. But I freely admit that I could not have completed this project or the JNI project without its help. There is just so much detailed, obscure, and esoteric knowledge about compiling, linking, tool flags, and platform idiosyncrasies that no person can know it all. While my Google searching skills are decent, I don't believe I could have found the answers I needed within the bounds of my patience. While ChatGPT is not perfect (it is limited by published APIs and documentation and can get confused about the requirements of different software versions), it was definitely a big help to me!

The Arena

The basic idea of FFM is that you take over the management of native memory in Java code instead of native code. This starts with an Arena, which can be opened and disposed of in a try-with-resources block like any other resource. Also within the Java code you can lay out the memory of the structs you'll be using.

GroupLayout RASHUNAL_LAYOUT = MemoryLayout.structLayout(
    JAVA_INT.withName("numerator"),
    JAVA_INT.withName("denominator")
);

GroupLayout GAUSS_FACTORIZATION_LAYOUT = MemoryLayout.structLayout(
    ADDRESS.withName("PI"),
    ADDRESS.withName("L"),
    ADDRESS.withName("D"),
    ADDRESS.withName("U")
);

try (Arena arena = Arena.ofConfined()) {
...    
}

MemoryLayout is an interface with static methods to lay out primitives, structs, arrays, and other entities. The Arena object is then used to allocate blocks of native memory using a layout as a map.


int[][][] data = ...; // the input matrix (elided)
int height = data.length;
int width = data[0].length;
int elementCount = height * width;

long elementSize = RASHUNAL_LAYOUT.byteSize();
long elementAlign = RASHUNAL_LAYOUT.byteAlignment();
long totalBytes = elementSize * (long)elementCount;
MemorySegment elems = arena.allocate(totalBytes, elementAlign);
long numOffset = RASHUNAL_LAYOUT.byteOffset(MemoryLayout.PathElement.groupElement("numerator"));
long denOffset = RASHUNAL_LAYOUT.byteOffset(MemoryLayout.PathElement.groupElement("denominator"));
for (int i = 0; i < elementCount; ++i) {
    int row = i / width;
    int col = i % width;
    int[] element = data[row][col];
    int numerator = element[0];
    int denominator = element.length == 1 ? 1 : element[1];
    MemorySegment elementSlice = elems.asSlice(i * elementSize, elementSize);
    elementSlice.set(JAVA_INT, numOffset, numerator);
    elementSlice.set(JAVA_INT, denOffset, denominator);
}

Before, with JNI, all of this was done in C. Now it's all being done in Java. It's a lot of steps, and it gets pretty far down into the weeds, but there are advantages to doing it all in Java. Pick your poison.

Native methods are exposed to Java code as method handles, which are created as downcall handles (calls from Java into native methods) on a Linker object. To create the handle you need the full signature of the native method, with the return value of the call first.


Linker linker = Linker.nativeLinker();
SymbolLookup lookup = OpenNativeLib("rmatrix", arena); // I'll come back to this later
MemorySegment newRMatrixLocation = lookup.find("new_RMatrix").orElseThrow();
MethodHandle new_RMatrix_handle = linker.downcallHandle(newRMatrixLocation, FunctionDescriptor.of(ADDRESS, JAVA_LONG, JAVA_LONG, ADDRESS));

After getting a Linker object, the native library needs to be opened and brought into the JVM. OpenNativeLib is a static method I wrote on the utility class this code is coming from, and I'll come back to its details later.

linker.downcallHandle accepts a MemorySegment, a FunctionDescriptor, and a variable-length list of Linker.Options. It returns a MethodHandle that can be used to call into native methods.

The SymbolLookup returned by OpenNativeLib is used to search the native library for methods and constants. It's a simple name lookup, and it returns an Optional with whatever it finds.

The FunctionDescriptor is fairly self-explanatory: it's the signature of a native method, built from constants in java.lang.foreign.ValueLayout representing the return value and the arguments (return value first, followed by the arguments). ADDRESS is a general value for a C pointer. new_RMatrix accepts longs representing the height and width of the matrix to be constructed and a pointer to an array of Rashunals, and returns a pointer to the newly-allocated RMatrix.

Once the handle for new_RMatrix is in hand, it can be called to allocate a new RMatrix:


new_RMatrix_handle.invoke((long) height, (long) width, elems);
// compiles, but blows up when run

Not so fast! elems represents an array of Rashunal structs laid out in sequence in native memory. But what new_RMatrix expects is a pointer to an array of Rashunal pointers, not the array of Rashunals themselves. So that array of pointers also needs to be constructed:


MemorySegment ptrArray = arena.allocate(ADDRESS.byteSize() * elementCount, ADDRESS.byteAlignment());
for (int i = 0; i < elementCount; ++i) {
    MemorySegment elementAddr = elems.asSlice(i * elementSize, elementSize);
    ptrArray.setAtIndex(ADDRESS, i, elementAddr);
}
MemorySegment nativeRMatrix = new_RMatrix_handle.invoke((long) height, (long) width, ptrArray);

In a similar way, I got handles to RMatrix_gelim to factor the input matrix and RMatrix_height, RMatrix_width, and RMatrix_get to get information about the four matrices in the factorization. There was one wrinkle when getting information about structs returned by pointer from these methods:


MemorySegment factorZero = (MemorySegment) RMatrix_gelim_handle.invoke(rmatrixPtr);
MemorySegment factor = factorZero.reinterpret(GAUSS_FACTORIZATION_LAYOUT.byteSize(), arena, null);
long piOffset = GAUSS_FACTORIZATION_LAYOUT.byteOffset(MemoryLayout.PathElement.groupElement("PI"));
...
MemorySegment piPtr = factor.get(ADDRESS, piOffset);
...

When a native method returns a pointer to a struct, the handle returns a zero-length memory segment that has no information about the struct pointed to by that memory. It needs to be reinterpreted as the struct itself using the MemoryLayout that corresponds to the struct. Then the struct can be interpreted using offsets in the reverse of the process used to set data.

Then I worked on the code to translate them back to Java objects:


long numeratorOffset = RASHUNAL_LAYOUT.byteOffset(MemoryLayout.PathElement.groupElement("numerator"));
long denominatorOffset = RASHUNAL_LAYOUT.byteOffset(MemoryLayout.PathElement.groupElement("denominator"));
long height = (long) RMatrix_height_handle.invoke(mPtr);
long width = (long) RMatrix_width_handle.invoke(mPtr);
JRashunal[] data = new JRashunal[Math.toIntExact(height * width)];
for (long i = 1; i <= height; ++i) {
    for (long j = 1; j <= width; ++j) {
        MemorySegment elementZero = (MemorySegment) RMatrix_get_handle.invoke(mPtr, i, j);
        MemorySegment element = elementZero.reinterpret(RASHUNAL_LAYOUT.byteSize(), arena, null);
        int numerator = element.get(JAVA_INT, numeratorOffset);
        int denominator = element.get(JAVA_INT, denominatorOffset);
        data[Math.toIntExact((i - 1) * width + (j - 1))] = new JRashunal(numerator, denominator);
    }
}
JRashunalMatrix jrm = new JRashunalMatrix(Math.toIntExact(height), Math.toIntExact(width), data);

The offsets are the memory offset within the struct of the field of interest, in this case, the numerator and denominator of the Rashunal struct.

In this way I was able to complete a round trip from Java objects to native code and back.

Missing Link

So how do you load the native code? I thought it would be as simple as the guides say.


var lookup = SymbolLookup.libraryLookup("rmatrix", arena);

Unfortunately, that's not the way it turned out. Many ChatGPT questions and answers followed, but apparently there is a big difference between SymbolLookup.libraryLookup and


System.loadLibrary("jnirmatrix");

which is how I loaded the native library compiled from the JNI header. That used C tools to find rmatrix and rashunal, which are well-understood and have stood the test of time.

According to ChatGPT, System.loadLibrary does a lot of additional work on behalf of the programmer, including formatting library names correctly, looking for code in platform-specific locations, and handling symlinks. FFM deliberately dials back on that. The Javadoc for SymbolLookup.libraryLookup says it defers to dlopen on POSIX systems and LoadLibrary on Windows systems; those search the path and some environment variables for libraries, but do none of the name enhancements (libLib.so, libLib.dylib, or Lib.dll for a library named "Lib") that System.loadLibrary does. This made a bad first impression, but platform-specific code turns out to be the way to do it in .NET too, so it's not too bad. /usr/local/lib is on the search path in Linux, but I installed the libraries in a nonstandard location on Windows, so I had to add those directories to PATH.


String osSpecificLibrary;
String osName = System.getProperty("os.name");
if (osName.contains("Linux")) {
    osSpecificLibrary = "lib" + library + ".so";
} else if (osName.contains("Mac OS")) {
    osSpecificLibrary = "lib" + library + ".dylib";
} else if (osName.contains("Windows")) {
    osSpecificLibrary = library + ".dll";
} else {
    throw new IllegalStateException("Unsupported OS: " + osName);
}
return SymbolLookup.libraryLookup(osSpecificLibrary, arena);

> $env:PATH += ";C:/Users/john.todd/local/rmatrix/bin;C:/Users/john.todd/local/rashunal/bin"
> ./gradlew ...

Trying to get this to work on a Mac was an odyssey on its own. Modern versions of MacOS (since OS X El Capitan) have something called System Integrity Protection (SIP), which the developers in Cupertino have wisely put into place to protect us all from ourselves. The Google AI answer for "what is sip macos" says it "Prevents unauthorized code execution: SIP prevents malicious software from running unauthorized code on your Mac", which I guess includes loading dependent libraries from the JVM.

I could load RMatrix using an absolute path to the dylib, but I couldn't load Rashunal from there because RMatrix uses rpaths (run-time search paths) to refer to the libraries it depends on. rpaths can be supplied in other situations (like the JNI application) by DYLD_LIBRARY_PATH or DYLD_FALLBACK_LIBRARY_PATH, but SIP keeps that from working in certain contexts, such as the JVM (invoked in a particular way). After many big detours into rewriting rpaths to loader_paths or absolute paths and granting the JVM entitlements that allowed loading paths from DYLD_LIBRARY_PATH, I finally discovered that java and /usr/bin/java on my Mac are not the same as /Library/Java/JavaVirtualMachines/jdk-24.jdk/Contents/Home/bin/java. Specifically, the first two have the SIP restrictions, but the last one doesn't, and it just works with the osSpecificLibrary defined above. Having already spent a lot of time trying to discover how to bypass SIP, I wasn't going to look any further into how to get the /usr/bin/java shim to work. So the following command worked from the command line on a Mac. Gradle could probably be convinced to do it too, but it didn't by default and I wasn't interested in investigating further.


$ /Library/Java/JavaVirtualMachines/jdk-24.jdk/Contents/Home/bin/java \
  -cp app/build/classes/java/main \
  --enable-native-access=ALL-UNNAMED \
  org.jtodd.ffm.ffmrmatrix.App \
  /Users/john/workspace/rmatrix/driver/example.txt
Input matrix:
[ {-2} {1/3} {-3/4} ]
[ {6} {-1} {8} ]
[ {8} {3/2} {-7} ]


PInverse:
[ {1} {0} {0} ]
[ {0} {0} {1} ]
[ {0} {1} {0} ]


Lower:
[ {1} {0} {0} ]
[ {-3} {1} {0} ]
[ {-4} {0} {1} ]


Diagonal:
[ {-2} {0} {0} ]
[ {0} {17/6} {0} ]
[ {0} {0} {23/4} ]


Upper:
[ {1} {-1/6} {3/8} ]
[ {0} {1} {-60/17} ]
[ {0} {0} {1} ]

Cleaning up

Like Java's good old garbage collector, the Arena will clean up any memory directly allocated in it, like the Rashunal array or the pointer array in the code segments above. But memory that is allocated by the native code is opaque to the Java code, and will leak if it's not cleaned up. To do that, you need handles to any library-specific cleanup code or to the stdlib free method. FFM has a special Linker method to look up the language's standard library; note also the special-purpose FunctionDescriptor.ofVoid method for describing native methods that return void:


MemorySegment freeRMatrixLocation = lookup.find("free_RMatrix").orElseThrow();
MethodHandle freeRMatrixHandle = linker.downcallHandle(freeRMatrixLocation, FunctionDescriptor.ofVoid(ADDRESS));

var clib = linker.defaultLookup();
MemorySegment freeLocation = clib.find("free").orElseThrow();
MethodHandle freeHandle = linker.downcallHandle(freeLocation, FunctionDescriptor.ofVoid(ADDRESS));

freeRMatrixHandle.invoke(rmatrixPtr);
freeHandle.invoke(rashunalElement);

I briefly looked at using Valgrind to verify that I wasn't leaking anything further. Apparently the JVM itself spawns a lot of false (?) alarms. I grepped the output for any mentions of librmatrix or librashunal and didn't find any, so hopefully this approach doesn't leak too badly.

Reflection

My first impression of FFM was pretty bad. I had to do a lot more investigating and ChatGPT querying to get this to work on all my platforms than I did with JNI. I'm not sure if any further improvements to Java, FFM, or the operating systems will take away some of the pain. Maybe just time, experience, and more bloggers will make this easier for future developers.

It is nice being able to write all your marshaling and unmarshaling code in a single language, rather than having to write both Java and C code to do it. Nevertheless, an FFM developer still needs to keep C concepts in mind, particularly freeing natively-allocated memory and linking to the libraries. But that seems to be the common thread when connecting to native code.

Code repository

https://github.com/proftodd/GoingNative/tree/main/ffm_rmatrix

Monday, September 8, 2025

Going Native - Java Native Interface (JNI)

Why do humans like old things?

Dr. Noonien Soong, Star Trek the Next Generation

Although I haven't actually done it very much, I've always been fascinated by the idea of calling into old code from modern applications. Who knows what value is locked away in those old libraries? Graphics, matrix calculations, statistics, quantum mechanical calculations, etc. I want to be able to do it all!

In reality, old code is probably dusty, unmaintained, and harder to use than modern code. I'm still interested in being able to access it.

As discussed in the introduction to this series, I wrote a small library to do matrix calculations on rational numbers so that I could focus on the calculations without worrying about data loss or errors due to rounding. This post is about calling into it from Java via the Java Native Interface (JNI).

Starting point

To get a basic education I started with Baeldung's Tutorial on JNI. This gave a few basic examples of how to write the Java code, compile it, generate the C header file, write and compile the implementation of that, and call it all together. In particular, the Using Objects and Calling Java Methods From Native Code section was a good introduction to generating Java objects from the native side of the world.

I did most of this work in an Ubuntu image under WSL on a Windows machine, using the Java 11 SDK for the Java compilation steps.

Calling RMatrix from Java, creating Java objects from the results

I wrote some simple Java classes as counterparts of the C structs. Then I wrote a simple driver to create a small matrix, call the Gauss Factorization method, and display the U matrix. To call the native method I decided to pass a three-dimensional integer array. The first two dimensions represent the height and width of the matrix. The third dimension is either a one- or two-element array representing the numerator and denominator of a rational number, with a one-element array representing a denominator of 1. This matches well with the behavior of String.split("/").


package org.jtodd.jni;

public class JRashunal {
    private int numerator;
    private int denominator;

    public JRashunal(int numerator, int denominator) {
        this.numerator = numerator;
        this.denominator = denominator;
    }

    @Override
    public String toString() {
        if (denominator == 1) {
            return String.format("{%d}", numerator);
        } else {
            return String.format("{%d/%d}", numerator, denominator);
        }
    }
}

package org.jtodd.jni;

public class JRashunalMatrix {
    private int height;
    private int width;
    private JRashunal[] data;

    public JRashunalMatrix(int height, int width, JRashunal[] data) {
        this.height = height;
        this.width = width;
        this.data = data;
    }

    @Override
    public String toString() {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < height; ++i) {
            builder.append("[ ");
            for (int j = 0; j < width; ++j) {
                builder.append(data[i * width + j]);
                builder.append(" ");
            }
            builder.append("]\n");
        }
        return builder.toString();
    }
}

package org.jtodd.jni;

public class RMatrixJNI {
    
    static {
        System.loadLibrary("jnirmatrix");
    }

    public static void main(String[] args) {
        RMatrixJNI app = new RMatrixJNI();
        int data[][][] = {
            { { 1    }, { 2 }, { 3, 2 }, },
            { { 4, 3 }, { 5 }, { 6    }, },
        };
        JRashunalMatrix u = app.factor(data);
        System.out.println(u);
    }

    private native JRashunalMatrix factor(int data[][][]);
}
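For reference, the token-to-array encoding described above can be sketched with String.split; the parseEntry helper here is illustrative, not part of the project's code:

```java
import java.util.Arrays;

public class ParseDemo {
    // Turn "3/2" into {3, 2} and "5" into {5}, matching the driver's
    // convention that a one-element array means a denominator of 1.
    static int[] parseEntry(String token) {
        String[] parts = token.split("/");
        int[] entry = new int[parts.length];
        for (int i = 0; i < parts.length; ++i) {
            entry[i] = Integer.parseInt(parts[i].trim());
        }
        return entry;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseEntry("3/2"))); // [3, 2]
        System.out.println(Arrays.toString(parseEntry("5")));   // [5]
    }
}
```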

The Java code is compiled and the C header file is generated in the same step:


$ javac -cp build -h build -d build RMatrixJNI.java JRashunal.java JRashunalMatrix.java

Other blogs, including Baeldung's, show you what a JNI header file looks like, so I won't copy it all here. The most important line is the declaration of the method defined in the Java class:


JNIEXPORT jobject JNICALL Java_org_jtodd_jni_RMatrixJNI_factor
  (JNIEnv *, jobject, jobjectArray);

This is the method that you have to implement in the code you write. Include the header file generated by the Java compiler, as well as the Rashunal and RMatrix libraries. I named the C file the same as the header file generated by the compiler.


#include "rashunal.h"
#include "rmatrix.h"
#include "org_jtodd_jni_RMatrixJNI.h"

JNIEXPORT jobject JNICALL Java_org_jtodd_jni_RMatrixJNI_factor (JNIEnv *env, jobject thisObject, jobjectArray jdata)
{
  ...
}

After I got this far, Baeldung couldn't help me much anymore. I turned to the full list of functions defined in the JNI specification. This let me get the dimensions of the Java array and allocate the array of C Rashunals:


    long height = (long)(*env)->GetArrayLength(env, jdata);
    jarray first_row = (*env)->GetObjectArrayElement(env, jdata, 0);
    long width = (long)(*env)->GetArrayLength(env, first_row);

    size_t total = height * width;
    Rashunal **data = malloc(sizeof(Rashunal *) * total);

It took some fiddling, but then I figured out how to get data from the elements of the 2D array, create C Rashunals, create the C RMatrix, and factor it:


    for (size_t i = 0; i < total; ++i) {
        size_t row_index = i / width;
        size_t col_index = i % width;
        jarray row = (*env)->GetObjectArrayElement(env, jdata, row_index);
        jarray jel = (*env)->GetObjectArrayElement(env, row, col_index);
        long el_count = (long)(*env)->GetArrayLength(env, jel);
        jint *el = (*env)->GetIntArrayElements(env, jel, NULL);
        int numerator = (int)el[0];
        int denominator = el_count == 1 ? 1 : (int)el[1];
        // release the pinned array; JNI_ABORT skips copying back changes we didn't make
        (*env)->ReleaseIntArrayElements(env, jel, el, JNI_ABORT);
        data[i] = n_Rashunal(numerator, denominator);
    }
    RMatrix *m = new_RMatrix(height, width, data);
    Gauss_Factorization *f = RMatrix_gelim(m);

    const RMatrix *u = f->u;
    size_t u_height = RMatrix_height(u);
    size_t u_width = RMatrix_width(u);

The really tricky part was finding the Java class and constructor definitions from within the native code. The JNI uses something called descriptors to refer to primitives and objects:

  • The descriptors for primitives are single letters: I for integer, Z for boolean, etc.
  • The descriptor for a class is the fully-qualified class name, preceded by an L and trailed by a semicolon: Lorg/jtodd/jni/JRashunal;.
  • The descriptor for an array is the primitive/class descriptor preceded by an opening bracket: [I, [Lorg/jtodd/jni/JRashunal;. Multidimensional arrays add an opening bracket for each dimension of the array.
  • The descriptor for a method is the argument descriptors in parentheses, followed by the descriptor of the return value.
    • You can see an example of this in the header file generated by the Java compiler for the Java class: it accepts a three-dimensional array of integers and returns a JRashunalMatrix, so the signature is ([[[I)Lorg/jtodd/jni/JRashunalMatrix;.
  • If a method has multiple arguments, the descriptors are concatenated with no delimiter. This caused me a lot of grief because I couldn't find any documentation about it. ChatGPT finally gave me the clue to this. It also told me a handy tool to find the method signature of a compiled class: javap -s -p fully.qualified.ClassName.
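You don't have to work these descriptors out by hand, either: java.lang.invoke.MethodType in the standard library will print one for any signature you can express in Java. (String stands in below for JRashunalMatrix, which isn't on this sketch's classpath.)

```java
import java.lang.invoke.MethodType;

public class DescriptorDemo {
    public static void main(String[] args) {
        // Two-int method returning void: the "(II)V" descriptor
        System.out.println(MethodType.methodType(void.class, int.class, int.class)
                .toMethodDescriptorString()); // (II)V

        // int[][][] in, object out, mirroring the shape of the factor method
        System.out.println(MethodType.methodType(String.class, int[][][].class)
                .toMethodDescriptorString()); // ([[[I)Ljava/lang/String;
    }
}
```

This is a handy double-check before pasting a descriptor string into native code, where a typo only surfaces at runtime.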
So in our native code we first need to find the class definitions, then we need to find the constructors for those classes. The documentation for GetMethodID says the name of a constructor is <init>, and the return type is void (V):

    jclass j_rashunal_class = (*env)->FindClass(env, "org/jtodd/jni/JRashunal");
    jclass j_rmatrix_class = (*env)->FindClass(env, "org/jtodd/jni/JRashunalMatrix");
    jmethodID j_rashunal_constructor = (*env)->GetMethodID(env, j_rashunal_class, "<init>", "(II)V");
    jmethodID j_rmatrix_constructor = (*env)->GetMethodID(env, j_rmatrix_class, "<init>", "(II[Lorg/jtodd/jni/JRashunal;)V");

(Those II's look like the Roman numeral 2!)

That was the hard part. Although the syntax is ugly, allocating and populating an array of JRashunals and creating a JRashunalMatrix was pretty straightforward:


    jobjectArray j_rashunal_data = (*env)->NewObjectArray(env, u_height * u_width, j_rashunal_class, NULL);
    for (size_t i = 0; i < total; ++i) {
        const Rashunal *r = RMatrix_get(u, i / width + 1, i % width + 1);
        jobject j_rashunal = (*env)->NewObject(env, j_rashunal_class, j_rashunal_constructor, r->numerator, r->denominator);
        (*env)->SetObjectArrayElement(env, j_rashunal_data, i, j_rashunal);
        free((Rashunal *)r);
    }
    jobject j_rmatrix = (*env)->NewObject(env, j_rmatrix_class, j_rmatrix_constructor, RMatrix_height(u), RMatrix_width(u), j_rashunal_data);

Compiling, linking, and running

Up to now I've assumed you understand the basics of C syntax, compiling, linking, and running. I won't assume that for the rest of this because it got pretty tricky and took me a while to figure it out.

I've laid out my project like this:


$ tree .
.
├── JRashunal.java
├── JRashunalMatrix.java
├── RMatrixJNI.java
├── build
│   ├── all generated and compiled code
└── org_jtodd_jni_RMatrixJNI.c

4 directories, 31 files

I set `JAVA_HOME` to the root of the Java 11 SDK I'm using. To compile the C file:


$ echo $JAVA_HOME
/usr/lib/jvm/java-11-openjdk-amd64
$ cc -c -fPIC \
  -Ibuild \
  -I${JAVA_HOME}/include \
  -I${JAVA_HOME}/include/linux \
  org_jtodd_jni_RMatrixJNI.c \
  -o build/org_jtodd_jni_RMatrixJNI.o

Adjust the includes to find the JNI header files for your platform. If you installed Rashunal and RMatrix to a recognized location (/usr/local/include for me) the compiler should find them on its own. If not, add includes to them as well.


$ cc -shared -fPIC -o build/libjnirmatrix.so build/org_jtodd_jni_RMatrixJNI.o -L/usr/local/lib -lrashunal -lrmatrix -lc

To create the shared library you have to link in the Rashunal and RMatrix libraries, hence the additional link location and link switches.


$ LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH java -cp build \
  -Djava.library.path=/home/john/workspace/JavaJNI/build org.jtodd.jni.RMatrixJNI
[ {1} {2} {3/2} ]
[ {0} {1} {12/7} ]

This is the tricky one. Since we created a shared library (libjnirmatrix.so), we need to give the runtime the path to the linked Rashunal and RMatrix libraries. This isn't done through the java.library.path variable; that tells the JVM where to find the JNI shared library itself. You need the system-specific load path to tell the system's dynamic loader (not the JVM) where to find the libraries it links against. On Linux and MacOS that's the LD_LIBRARY_PATH variable. Thanks again, ChatGPT!
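The JVM side of that distinction is easy to poke at: java.library.path is just a system property consulted by System.loadLibrary, and a library that isn't on it fails with an UnsatisfiedLinkError. A small sketch (the library name is made up):

```java
public class LibPathDemo {
    public static void main(String[] args) {
        // These are the directories the JVM searches in System.loadLibrary;
        // the OS loader's LD_LIBRARY_PATH handles the library's *own* dependencies.
        System.out.println(System.getProperty("java.library.path"));

        try {
            System.loadLibrary("definitely_not_a_real_library");
        } catch (UnsatisfiedLinkError e) {
            // Thrown when the named library isn't on java.library.path
            System.out.println("UnsatisfiedLinkError, as expected");
        }
    }
}
```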

Whew, that's a lot of work! And who wants to type those absurdly long CLI commands?

Doing it in a modern build system: Gradle

I'm continuing in my Ubuntu WSL shell with Java 11 and Gradle 8.3.

$ mkdir jrmatrix
$ gradle init
# I chose a basic project with Groovy as the DSL and the new APIs

Apparently Gradle's Java and native plugins don't play nicely in the same project, so the first thing I did was separate the project into app (Java) and native (C) subprojects. All of the automatically-generated files could stay the way they were. I just needed to make a small change to settings.gradle:


rootProject.name = 'jrmatrix'

include('app', 'native')

Then I made folders for each subproject.

In the app subproject I created the typical Java folder structure:


$ tree .
.
├── build.gradle
└── src
    └── main
        └── java
            └── org
                └── jtodd
                    └── jni
                        ├── JRashunal.java
                        ├── JRashunalMatrix.java
                        ├── RMatrixJNI.java
                        └── jrmatrix
                            └── App.java

26 directories, 11 files

The build.gradle file in app needed switches to tell the Java compiler to generate the JNI header file and where to put it. This is no longer a separate command (javah), but is an additional switch on the javac command. In addition, I wanted the run task to depend on the native compilation and linking steps I'll describe later in the native subproject.


plugins {
    id 'application'
    id 'java'
}

tasks.named("compileJava") {
    def headerDir = file("${buildDir}/generated/jni")
    options.compilerArgs += ["-h", headerDir.absolutePath]
    outputs.dir(headerDir)
}

application {
    mainClass = 'org.jtodd.jni.jrmatrix.App'
    applicationDefaultJvmArgs = [
        "-Djava.library.path=" + project(":native").layout.buildDirectory.dir("libs/shared").get().asFile.absolutePath
    ]
}

tasks.named("run") {
    dependsOn project(":native").tasks.named("linkJni")
}

Most of the fun was in the native project. I set it up with a similar folder structure to Java projects:


$ tree .
.
├── build.gradle
└── src
    └── main
        └── c
            └── org_jtodd_jni_RMatrixJNI.c

6 directories, 4 files

Gradle has a cpp-library plugin, but it seems tailored to C++, not C. There is also a c-library plugin, but that doesn't appear to be bundled with Gradle 8.3, so I decided to skip it. The remaining alternative was the c plugin, which is pretty bare-bones. Much of the code will look similar to what I did earlier at the command line.

Like I did there, I had to write separate steps to compile and link the implementation of the JNI header file with the Rashunal and RMatrix libraries. After a couple of refactorings to pull out common definitions of the includes and to avoid hardcoding the names of the C source files, I wound up with this:


apply plugin: 'c'

def jniHeaders = { project(":app").layout.buildDirectory.dir("generated/jni").get().asFile }
def jvmHome = { file(System.getenv("JAVA_HOME")) }
def outputDir = { file("$buildDir/libs/shared") }

def osSettings = {
    def os = org.gradle.nativeplatform.platform.internal.DefaultNativePlatform.currentOperatingSystem
    def baseInclude = new File(jvmHome(), "/include")
    def includeOS
    def libName
    if (os.isLinux()) {
        includeOS = new File(jvmHome(), "/include/linux")
        libName = "libjnirmatrix.so"
    } else if (os.isMacOsX()) {
        includeOS = new File(jvmHome(), "/include/darwin")
        libName = "libjnirmatrix.dylib"
    } else if (os.isWindows()) {
        includeOS = new File(jvmHome(), "/include/win32")
        libName = "jnirmatrix.dll"
    } else if (os.isFreeBSD()) {
        includeOS = new File(jvmHome(), "/include/freebsd")
        libName = "libjnirmatrix.so"
    } else {
        throw new GradleException("Unsupported OS: $os")
    }
    [baseInclude, includeOS, libName]
}

def sourceDir = file("src/main/c")
def cSources = fileTree(dir: sourceDir, include: "**/*.c")
def objectFiles = cSources.files.collect { file ->
    new File(outputDir(), file.name.replaceAll(/\.c$/, ".o")).absolutePath
}

tasks.register('compileJni', Exec) {
    dependsOn project(":app").tasks.named("compileJava")
    outputs.dir outputDir()
    doFirst { outputDir().mkdirs() }

    def (baseInclude, includeOS, _) = osSettings()

    def compileArgs = cSources.files.collect { file ->
        [
            '-c',
            '-fPIC',
            '-I', jniHeaders().absolutePath,
            '-I', baseInclude.absolutePath,
            '-I', includeOS.absolutePath,
            file.absolutePath,
            '-o', new File(outputDir(), file.name.replaceAll(/\.c$/, ".o")).absolutePath
        ]
    }.flatten()

    commandLine 'gcc', *compileArgs
}

tasks.register('linkJni', Exec) {
    dependsOn tasks.named("compileJni")
    outputs.dir outputDir()
    doFirst { outputDir().mkdirs() }

    def (baseInclude, includeOS, libName) = osSettings()

    commandLine 'gcc',
        '-shared',
        '-fPIC',
        '-o', new File(outputDir(), libName).absolutePath,
        *objectFiles,
        '-I', jniHeaders().absolutePath,
        '-I', baseInclude.absolutePath,
        '-I', includeOS.absolutePath,
        '-L', '/usr/local/lib',
        '-l', 'rashunal',
        '-l', 'rmatrix',
        '-Wl,-rpath,/usr/local/lib'
}

tasks.named('build') {
    dependsOn tasks.named('compileJni')
    dependsOn tasks.named('linkJni')
}

Gradle subprojects have references to each other, so the native project can look up app's output directory to locate the generated JNI header file. The compileJni task depends on app's compileJava task, and native's build task depends on the compileJni and linkJni tasks defined in this file.

This worked if I explicitly called app's compileJava task and native's build task, but it failed after a clean task. It turned out Java's compile task wouldn't detect the deletion of the JNI header file as a change that required rebuilding, so I added the build directory as an output to the task (outputs.dir(headerDir)). Thus deleting that file (or cleaning the project) caused recompilation and rebuilding.

The nice thing is that this runs with a single command now (`./gradlew run`). Much nicer than entering all the command line commands by hand!

Reflection

As expected, this works but is very fragile. In particular, calling Java code from native code depends on hardcoded knowledge of the class and method signatures. If those change on the Java side, the project will compile and start just fine, then blow up at runtime with unhelpful error messages.

I was surprised by Gradle's basic tooling for C projects. I thought there would be more help than paralleling the command line so closely. I'll have to look into the `c-library` plugin to see if it offers any more help. I'm also surprised by how few blogs and Stack Overflow posts I found about this: apparently this isn't something very many people do (or live to tell the tale!).

Update

Turns out it is possible to compile C code with the cpp-library plugin, and it is a little more user-friendly than the bare bones C plugin.

I needed a common way to refer to the operating system name, so I put a library function in the root build.gradle file:


ext {
    // Normalize OS name into what Gradle's native plugin actually uses
    normalizedOsName = {
        def os = org.gradle.internal.os.OperatingSystem.current()
        if (os.isWindows()) {
            return "windows"
        } else if (os.isLinux()) {
            return "linux"
        } else if (os.isMacOsX()) {
            return "macos"
        } else if (os.isUnix()) {
            return "unix"
        } else {
            throw new GradleException("Unsupported OS: $os")
        }
    }
}

Then I can refer to it in app/build.gradle:


def osName = rootProject.ext.normalizedOsName()
def buildType = (project.findProperty("nativeBuildType") ?: "debug")

application {
    mainClass = 'org.jtodd.jni.jrmatrix.App'

    applicationDefaultJvmArgs = [
        "-Djava.library.path=${project(":native").layout.buildDirectory.dir("lib/main/${buildType}/${osName}").get().asFile.absolutePath}"
    ]
}

tasks.named("run") {
    dependsOn(":native:assemble")
}

The buildType and osName variables were required because the native plugin puts the library in locations that depend on them.

native/build.gradle was completely rewritten:


plugins {
    id 'cpp-library'
}

library {
    linkage.set([Linkage.SHARED])
    targetMachines = [
        machines.windows.x86_64,
        machines.macOS.x86_64,
        machines.linux.x86_64,
    ]
    baseName = "jnirmatrix"

    binaries.configureEach {
        def compileTask = compileTask.get()
        compileTask.dependsOn(project(":app").tasks.named("compileJava"))

        compileTask.source.from fileTree(dir: "src/main/c", include: "**/*.c")

        def jvmHome = System.getenv("JAVA_HOME")
        compileTask.includes.from(file("$jvmHome/include"))
        compileTask.includes.from(project(":app").layout.buildDirectory.dir("generated/sources/headers/java/main"))

        def os = org.gradle.internal.os.OperatingSystem.current()
        if (os.isWindows()) {
            compileTask.includes.from("$jvmHome/include/win32")
            compileTask.includes.from(file("C:/headers/rashunal/include"))
            compileTask.includes.from(file("C:/headers/rmatrix/include"))
            compileTask.compilerArgs.add("/TC")
        } else if (os.isLinux()) {
            compileTask.includes.from(file("$jvmHome/include/linux"))
            compileTask.compilerArgs.addAll(["-x", "c", "-fPIC", "-std=c11"])
        } else if (os.isMacOsX()) {
            compileTask.includes.from(file("$jvmHome/include/darwin"))
            compileTask.compilerArgs.addAll(["-x", "c", "-fPIC", "-std=c11"])
        } else if (os.isUnix()) {
            compileTask.includes.from(file("$jvmHome/include/freebsd"))
            compileTask.compilerArgs.addAll(["-x", "c", "-fPIC", "-std=c11"])
        } else {
            throw new GradleException("Unsupported OS for JNI build: $os")
        }

        def linkTask = linkTask.get()
        if (toolChain instanceof GccCompatibleToolChain) {
            linkTask.linkerArgs.addAll([
                "-L/usr/local/lib",
                "-lrashunal",
                "-lrmatrix",
                "-Wl,-rpath,/usr/local/lib"
            ])
        } else if (toolChain instanceof VisualCpp) {
            linkTask.linkerArgs.addAll([
                "C:/libs/rashunal.lib",
                "C:/libs/rmatrix.lib"
            ])
        }
    }
}

def osName = rootProject.ext.normalizedOsName().capitalize()
def buildType = (project.findProperty("nativeBuildType") ?: "debug").capitalize()
def targetTaskName = "link${buildType}${osName}"

tasks.named("assemble") {
    dependsOn tasks.named(targetTaskName)
}

The plugin is cpp-library, not cpp-application, because it builds a shared library, not an application. That might have been the problem I had before.

I set the linkage to shared (not static), and the machines I'm targeting. Then I set the base name of the shared library.

The binaries configuration has a dependency on the compile task of the app library, and it gets a list of source files.

JAVA_HOME is queried and the header files common to all tasks are set. Then additional headers and compiler flags are set based on operating system. Then linker arguments are set, again based on operating system.

Finally, the build type and operating system name are used to set the assemble task. This determines the location of the shared library (build/lib/main/[debug|release]/[linux|macos|windows]).

Details on Windows compilation

Compiling and linking was especially complicated on Windows. Specifically, the JNI implementation and the target libraries had to match exactly in CPU architecture (32-bit vs. 64-bit) and release configuration (Release vs. Debug). It took a while and a lot of back and forth with ChatGPT to figure it out.

  1. Open a 64-bit specific Visual Studio developer window.
    • Building for 64-bit is not the default in Windows, and NMake doesn't allow you to set the architecture when it's invoked. Hence the specific window to do it.
    • In the Windows Search bar start typing "x64". Choose "x64 Native Tools Command Prompt for VS 2022".
  2. Make a build directory in the native project. To distinguish it from any ordinary development directory I called it build-release. Change directories into it.
  3. CMake the project. On Windows NMake is most like GNU make, and it comes preinstalled with Visual Studio.
  4. 
    >cmake .. -G "NMake Makefiles" ^
      -DCMAKE_BUILD_TYPE=Release ^
      -DCMAKE_INSTALL_PREFIX=C:/Users/john.todd/local/rashunal ^
      -DCMAKE_C_FLAGS_RELEASE="/MD /O2 /DNDEBUG"
    >nmake
    >nmake install
    
  5. To verify the architecture use dumpbin to check the headers of the created dll.
  6. 
    >cd /Users/john.todd/local/rashunal/bin
    >dumpbin /headers rashunal.dll | findstr machine
                8664 machine (x64)
    
  7. Finally, add the complete paths to the DLLs to PATH, specify to build the native code as Release, and call the Java class. (The enable-native-access switch isn't required, but it does suppress some warnings.)
  8. 
    > $env:PATH += ";C:\Users\john.todd\local\rashunal\bin;C:\Users\john.todd\local\rmatrix\bin"
    > ./gradlew clean
    > ./gradlew build -PnativeBuildType=Release
    > ./gradlew run --args="C:/Users/john.todd/source/repos/rmatrix/driver/example.txt"
    

After all that it finally worked on Windows, joining Linux and MacOS. Not quite as nice, but at least it completes the big three operating systems.

https://github.com/proftodd/GoingNative/tree/main/jrmatrix

Thursday, September 4, 2025

Going Native - Calling Native Code

For a long time I've been interested in methods to call native code (which I've learned almost always means C and/or C++) from modern languages. Ever since I started working with Java I heard about something called JNI and that it could be used to unlock ancient mysteries hidden away in native code. At the end of this summer I had some time and finally decided to give it a try.

This will be a series of articles on calling native code (C and/or C++) from modern development languages. For now that means Java, Python, and C#. This post introduces the native library I wrote to demonstrate this process, while follow-on posts will demonstrate calling it from other languages.

The native code

Another subject I've been interested in for a long time is linear algebra, specifically matrices and their uses. To try to get a better understanding of them I wrote a couple of small C libraries using them. My linear algebra guide is Linear Algebra and Its Applications (3rd ed) by Gilbert Strang.

One is a basic library to handle the arithmetic of rational numbers: Rashunal.

The other is a library to handle matrices of Rashunals and some basic operations on them (add, multiply, row operations, Gauss factoring): RMatrix. Todos include matrix inversion, minors, and calculation of determinants. Finding eigenvalues and eigenvectors requires solving high-degree polynomial equations, so I'm not sure I'll get to that.

I wrote these to the best of my ability using CMake, Unity, and Valgrind. They should be cross-platform and installable on any system (Linux, MacOS, Windows). They have basic unit tests and have been checked for memory integrity. However, they should not be used in any production environment. Do not build your great AI model or statistics package on them!

Prerequisites and Setup

  • CMake
  • Make - Installation is different on different systems
    • Linux - almost certainly available on your distribution's package installer
      • Debian - sudo apt update && sudo apt install make
      • Red Hat - sudo yum update && sudo yum install make
    • MacOS - brew install make
    • Windows - choco install make
  • Your favorite C compiler
Clone the repositories above. The READMEs have instructions on how to compile and install the library code. Briefly:

$ cd Rashunal
$ cmake -S . -B build
$ cd build
$ make && make test && make install
$ cd ..
$ cd rmatrix
$ cmake -S . -B build
$ cd build
$ make && make test && make install

Approach

In each language I target, I'd like to write an application that can read a data file representing a matrix of rational numbers, send it to the native code, do something fairly complicated with it, and return it to the caller in some useful form. To demonstrate, I wrote a model application in the RMatrix repository. It reads a matrix in a simple space-delimited format, converts it to an RMatrix, factors it into P⁻¹, L, D, and U matrices, and displays the L and U matrices in the terminal window.


$ cd driver
$ cat example.txt
-2  1/3 -3/4
 6 -1    8
 8  3/2 -7
$ make
$ cat example.txt | ./driver
Height is 3
Width is 3
L:
[  1 0 0 ]
[ -3 1 0 ]
[ -4 0 1 ]
U:
[ 1 -1/6  3/8  ]
[ 0  1   60/17 ]
[ 0  0    1    ]

Next steps

In further entries in this series I'll demonstrate calling this library from several modern, widely-used, high-level languages. The ones I'm planning to target include:

  • Java
  • C#
  • Python
  • (maybe) Swift

But who knows how deep this rabbit hole will go?