Tuesday, October 14, 2025

Going Native - Swift

"Your second-hand bookseller is second to none in the worth of the treasures he dispenses."

Leigh Hunt

Coming home to roost

This is the completion of a series on calling native code from high-level languages. Here is a description of the native library I'm calling in this series.

Apple has used several languages for its operating system and devices, most notably Objective-C and Swift. But I read a few years ago that Swift had found some adoption in data analysis and Big Data applications because of its expressiveness and streaming features. Swift has been released in open source, so there are implementations for Linux and Windows in addition to MacOS. I did an Advent of Code in Swift one year, and enjoyed it. To wrap up this project of calling native code from high-level languages I decided to give Swift a try.

Getting Started

The interface for calling native code from Swift has changed recently. The mechanism is the Swift Package Manager, but the changes have meant some older references are out of date. One example that gave me hope, even though it didn't work, was this blog post: Wrapping C Libraries in Swift.

The example that got me going was directly from the Swift Documentation on the Swift Package Manager, particularly using system libraries to call native code.

Since Swift is an Apple-original language, I wasn't sure how it would translate to Windows. I was fairly confident in its applicability to Linux, though, so that's where I started. That meant writing a command line application instead of an app: those are Mac-only.


$ mkdir SwiftRMatrix
$ cd SwiftRMatrix
$ swift package init --type executable
$ tree .
.
├── Package.swift
└── Sources
    └── SwiftRMatrix
        └── SwiftRMatrix.swift

2 directories, 2 files

These commands set up a group of files and directories, the most important of which are Package.swift and Sources/SwiftRMatrix/SwiftRMatrix.swift. The latter is the entrypoint to the application, and the former is the directions for how to build the project. This is all that is needed to run "Hello, world!": you can do swift run at this point and see the message printed to the console.

Linking to native code is a matter of writing new modules and setting up dependencies among the modules in the project.


$ mkdir Sources/CRashunal
$ touch Sources/CRashunal/rashunal.h
$ touch Sources/CRashunal/module.modulemap

rashunal.h:


#import <rashunal.h>

module.modulemap:


module CRashunal [system] {
    umbrella header "rashunal.h"
    link "rashunal"
}

rashunal.h, which is distinct from the rashunal.h I wrote for the Rashunal project, is simply a transitive import to the native code, bringing all the declarations in the original rashunal.h into the Swift project. module.modulemap emphasizes this by saying that rashunal.h is an umbrella header, and that the code will link the rashunal library. At this point, CRashunal (the Swift project) can be imported into Swift code and used.

Package.swift:


// swift-tools-version: 6.2
// The swift-tools-version declares the minimum version of Swift required to build this package.

import PackageDescription

let package = Package(
    name: "SwiftRMatrix",
    dependencies: [],
    targets: [
        // Targets are the basic building blocks of a package, defining a module or a test suite.
        // Targets can depend on other targets in this package and products from dependencies.
        .systemLibrary(
            name: "CRashunal"
        ),
        .executableTarget(
            name: "SwiftRMatrix",
            dependencies: ["CRashunal"],
            path: "Sources/SwiftRMatrix"
        ),
    ]
)

SwiftRMatrix.swift:


// The Swift Programming Language
// https://docs.swift.org/swift-book
import CRashunal
import Foundation

@main
struct SwiftRMatrix {
    static func main() {
        // n_Rashunal allocates a Rashunal in native memory and returns a pointer to it
        let r: UnsafeMutablePointer<Rashunal> = n_Rashunal(numericCast(1), numericCast(2))
        print("{\(r.pointee.numerator),\(r.pointee.denominator)}")
    }
}

I like that Swift distinguishes between mutable and immutable pointers (UnsafeMutablePointer and UnsafePointer), and uses generics to indicate what the pointer is to. Swift also has an OpaquePointer when the fields of a struct are not imported, like an RMatrix. I'll come back to that later. The pointee field to access the fields of the struct is an additional bonus.

ChatGPT pointed me to memory safety early on, so I learned quickly how to access the standard library on the different platforms. Swift recognizes C-like compiler directives, so accessing it was a simple matter of importing the right native libraries. For Windows, it's a part of the platform, so no special import is needed.


#if os(Linux)
import Glibc
#elseif os(Windows)

#elseif os(macOS)
import Darwin
#else
#error("Unsupported platform")
#endif
...
let r: UnsafeMutablePointer<Rashunal> = n_Rashunal(numericCast(1), numericCast(2))
print("{\(r.pointee.numerator),\(r.pointee.denominator)}")
free(r)

And that's it, for code. The devil, of course, is in the compiling and linking.

A chain is only as strong as its weakest link

Swift Package Manager uses several sources to find libraries, but none of them seemed to match my particular use case. The closest was to make use of pkg-config. The more I read about it, the more it seemed to be an industry standard, and that Rashunal and RMatrix would benefit by taking advantage of it. So I broke my rule that I established earlier and decided to enhance the libraries.

Fortunately, it wasn't too painful. Telling Rashunal to write to pkg-config was only a few lines added to rashunal/CMakeLists.txt:


+set(PACKAGE_NAME rashunal)
+set(PACKAGE_VERSION 0.0.1)
+set(PACKAGE_DESC "Rational arithmetic library")
+set(PKGCONFIG_INSTALL_DIR "${CMAKE_INSTALL_LIBDIR}/pkgconfig")
+
+configure_file(
+  ${CMAKE_CURRENT_SOURCE_DIR}/rashunal.pc.in
+  ${CMAKE_CURRENT_BINARY_DIR}/${PACKAGE_NAME}.pc
+  @ONLY
+)
+
 add_library(rashunal SHARED src/rashunal.c src/rashunal_util.c)
...
+install(
+  FILES ${CMAKE_CURRENT_BINARY_DIR}/rashunalConfig.cmake
+  DESTINATION lib/cmake/rashunal
+)
+
+install(
+  FILES ${CMAKE_CURRENT_BINARY_DIR}/${PACKAGE_NAME}.pc
+  DESTINATION ${PKGCONFIG_INSTALL_DIR}
 )

The first block is toward the top of CMakeLists.txt, and the second is toward the bottom.

The configure_file directive needs a template for the pc file that will be written. The template has placeholders set off by '@' that will be filled in during the build process.

rashunal.pc.in:


prefix=@CMAKE_INSTALL_PREFIX@
exec_prefix=${prefix}
libdir=${exec_prefix}/@CMAKE_INSTALL_LIBDIR@
includedir=${prefix}/@CMAKE_INSTALL_INCLUDEDIR@

Name: @PACKAGE_NAME@
Description: @PACKAGE_DESC@
Version: @PACKAGE_VERSION@
Libs: -L${libdir} -l@PACKAGE_NAME@
Cflags: -I${includedir}

During installation the generated rashunal.pc file is copied to a platform-standard location on disk.
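To make the substitution concrete, here is a minimal Python sketch of what configure_file does with @ONLY: only @NAME@ placeholders are replaced, and ${...} references are left alone for pkg-config to resolve later. The function name configure and the regular expression are my own illustration, not CMake's implementation; the empty-string fallback for undefined names mirrors CMake's documented behavior.

```python
import re

def configure(template, variables):
    """Replace @NAME@ placeholders; leave ${name} references untouched."""
    def substitute(match):
        # Undefined names become empty strings, as CMake documents for configure_file
        return variables.get(match.group(1), "")
    return re.sub(r"@([A-Za-z0-9_]+)@", substitute, template)

template = (
    "prefix=@CMAKE_INSTALL_PREFIX@\n"
    "libdir=${exec_prefix}/@CMAKE_INSTALL_LIBDIR@\n"
    "Libs: -L${libdir} -l@PACKAGE_NAME@\n"
)
values = {
    "CMAKE_INSTALL_PREFIX": "/usr/local",
    "CMAKE_INSTALL_LIBDIR": "lib",
    "PACKAGE_NAME": "rashunal",
}
rendered = configure(template, values)
print(rendered)
```

The ${libdir} and ${exec_prefix} references survive untouched, which is exactly why the template can mix CMake placeholders with pkg-config variables.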

After making those changes, building, compiling, and installing, pkg-config was able to tell me something about the Rashunal library:


$ rm -rf build
$ mkdir build
$ cd build
$ cmake ..
$ make && sudo cmake --install .
$ ls /usr/local/lib/pkgconfig
rashunal.pc
$ cat /usr/local/lib/pkgconfig/rashunal.pc
prefix=/usr/local
exec_prefix=${prefix}
libdir=${exec_prefix}/lib
includedir=${prefix}/include

Name: rashunal
Description: Rational arithmetic library
Version: 0.0.1
Libs: -L${libdir} -lrashunal
Cflags: -I${includedir}
$ pkg-config --cflags rashunal
-I/usr/local/include
$ pkg-config --libs rashunal
-L/usr/local/lib -lrashunal

Notice the new command to install the project: apparently this is the more modern and more approved way to do it nowadays. The pkg-config output means that the declarations of the Rashunal library can be found at /usr/local/include and the binaries at /usr/local/lib.
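How pkg-config gets from the .pc file to those flags can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions (variables are defined before they are used, and nothing else in the pkg-config format is handled); parse_pc is a hypothetical helper, not part of pkg-config.

```python
import re

PC_TEXT = """\
prefix=/usr/local
exec_prefix=${prefix}
libdir=${exec_prefix}/lib
includedir=${prefix}/include

Name: rashunal
Description: Rational arithmetic library
Version: 0.0.1
Libs: -L${libdir} -lrashunal
Cflags: -I${includedir}
"""

def parse_pc(text):
    """Return (variables, fields) with ${name} references expanded."""
    variables, fields = {}, {}

    def expand(value):
        return re.sub(r"\$\{(\w+)\}", lambda m: variables[m.group(1)], value)

    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if re.match(r"^[A-Za-z_.]+:", line):
            # "Keyword: value" lines are fields such as Libs and Cflags
            key, value = line.split(":", 1)
            fields[key] = expand(value.strip())
        elif "=" in line:
            # "name=value" lines define variables for later lines
            key, value = line.split("=", 1)
            variables[key] = expand(value)
    return variables, fields

variables, fields = parse_pc(PC_TEXT)
print(fields["Cflags"])  # -I/usr/local/include
print(fields["Libs"])    # -L/usr/local/lib -lrashunal
```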

Now the Swift Package Manager can be told just to consult pkg-config for the header and binary location of any system libraries it's attempting to build. It's not necessary, but the examples I saw recommended adding some suggestions for how to install Rashunal if it's not present. I haven't looked into what it takes to package a library for apt or brew, but I'm pretty sure this is how they are consumed:

Package.swift:


.systemLibrary(
    name: "CRashunal",
    pkgConfig: "rashunal",
    providers: [
        .apt(["rashunal"]),
        .brew(["rashunal"]),
    ],
)

Then the Swift project could be built and run:


$ swift build
$ swift run SwiftRMatrix
{1,2}

And rinse and repeat for RMatrix. There is nothing new in building the RMatrix pkg-config files or linking to it from Swift, except for the dependency on Rashunal in the template for RMatrix:

rmatrix.pc.in


prefix=@CMAKE_INSTALL_PREFIX@
exec_prefix=${prefix}
libdir=${exec_prefix}/@CMAKE_INSTALL_LIBDIR@
includedir=${prefix}/@CMAKE_INSTALL_INCLUDEDIR@

Name: @PACKAGE_NAME@
Description: @PACKAGE_DESC@
Version: @PACKAGE_VERSION@
Requires: rashunal
Libs: -L${libdir} -l@PACKAGE_NAME@
Cflags: -I${includedir}

I started to look into removing that hardcoded dependency and getting it from the link libraries in CMakeLists.txt, but that quickly started to grow big and nasty, so I abandoned it. ChatGPT assured me that was common, especially for small projects.

Crossing the operating system ocean

Trying to do this on MacOS, I ran into my old nemesis SIP. Fortunately, the solution here was similar to the solution I followed there. The Swift command at /usr/bin/swift was protected by SIP, but the executable generated by the swift build command wasn't:


% swift build -Xlinker -rpath -Xlinker /usr/local/lib
% .build/debug/SwiftRMatrix
{1,2}

What is astonishing is that, with one more testy exchange with ChatGPT, I also got it to work on Windows. I still don't understand what the difference was from Linux and MacOS, or how this changed things on Windows, but I had to make an additional change to Rashunal's CMakeLists.txt and to the cmake command that builds RMatrix:

rashunal/CMakeLists.txt


if (WIN32)
  set_target_properties(rashunal PROPERTIES
    ARCHIVE_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/bin"
    RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/bin"
  )
endif()

>cmake .. -G "NMake Makefiles" ^
More? -DCMAKE_BUILD_TYPE=Release ^
More? -DCMAKE_INSTALL_PREFIX=C:/Users/john.todd/local/rmatrix ^
More? -DCMAKE_PREFIX_PATH=C:/Users/john.todd/local/rashunal ^
More? -DCMAKE_C_FLAGS_RELEASE="/MD /O2 /DNDEBUG"
>nmake
>nmake install

Then the Swift application could be built and run from the command line, albeit with a few additional linker switches. This also needs to be done from a PowerShell or DOS window with Admin rights because, even though it only changes the local project directory, it seems to write to a protected directory.


> swift build `
>>   -Xcc -IC:/Users/john.todd/local/rashunal/include `
>>   -Xcc -IC:/Users/john.todd/local/rmatrix/include `
>>   -Xlinker /LIBPATH:C:/Users/john.todd/local/rashunal/lib `
>>   -Xlinker /LIBPATH:C:/Users/john.todd/local/rmatrix/lib `
>>   -Xlinker /DEFAULTLIB:rashunal.lib `
>>   -Xlinker /DEFAULTLIB:rmatrix.lib `
>>   -Xlinker /DEFAULTLIB:ucrt.lib
> ./.build/debug/SwiftRMatrix.exe
{1,2}

Cleaning up the guano

My last task was to abstract the native calls away from the main application. To do this I wrote a Model module that wrapped the native Rashunal, RMatrix, and Gauss Factorization structs.

Sources/Model/Model.swift


public class Rashunal: CustomStringConvertible {
    var _rashunal: UnsafePointer<CRashunal.Rashunal>

    public init(_ numerator: Int, _ denominator: Int = 1) {
        _rashunal = UnsafePointer(n_Rashunal(numericCast(numerator), numericCast(denominator)))
    }

    public init(_ data: [Int]) {
        _rashunal = UnsafePointer(n_Rashunal(numericCast(data[0]), data.count > 1 ? numericCast(data[1]) : 1))
    }

    public var numerator: Int { Int(_rashunal.pointee.numerator) }

    public var denominator: Int { Int(_rashunal.pointee.denominator) }

    public var description: String {
        return "{\(numerator),\(denominator)}"
    }

    deinit {
        free(UnsafeMutablePointer(mutating: _rashunal))
    }
}

What gets returned from the native n_Rashunal call is a Swift UnsafeMutablePointer. I wanted them to be immutable wherever possible, so I cast it to an UnsafePointer in both the constructors. Swift makes property definition and string representations easy and natural. The deinit method calls the native standard library's free method to release the native memory allocated by Rashunal. This makes cleanup and memory hygiene easy.

Sources/Model/Model.swift


public class RMatrix: CustomStringConvertible {
    var _rmatrix: OpaquePointer

    private init(_ rmatrix: OpaquePointer) {
        _rmatrix = rmatrix
    }

    public init(_ data: [[[Int]]]) {
        let height = data.count
        let width = data.first!.count

        let rashunals = data.flatMap {
            row in row.map {
                cell in n_Rashunal(numericCast(cell[0]), cell.count > 1 ? numericCast(cell[1]) : 1)
            }
        }
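        let ptrArray = UnsafeMutablePointer<UnsafeMutablePointer<CRashunal.Rashunal>?>.allocate(capacity: rashunals.count)
        for i in 0..<rashunals.count {
...
                let cellPtr: UnsafePointer<CRashunal.Rashunal> = RMatrix_get(_rmatrix, i, j)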
                let rep = "{\(cellPtr.pointee.numerator),\(cellPtr.pointee.denominator)}"
                free(UnsafeMutablePointer(mutating: cellPtr))
                return rep
            }.joined(separator: " ") + " ]"
        }.joined(separator: "\n")
    }

    deinit {
        free_RMatrix(_rmatrix)
    }
}

Unsurprisingly, RMatrix was the hardest of these to get right. The private constructor is used in the factor method as a convenience method to initialize a Swift RMatrix. The other constructor is used to initialize a matrix from the familiar 3D array of Ints. I get the height and width from the first two dimensions of the input array, then use the n_Rashunal method to construct a list of native Rashunal structs as UnsafeMutablePointer<CRashunal.Rashunal>s. As before, new_RMatrix expects an array of pointers to structs, but the rashunals array is in managed memory, not native memory. So I allocate and fill an array of pointers to the Rashunal structs in native memory. ChatGPT suggested I add the defer block in case new_RMatrix abends for any reason. Because the RMatrix struct is declared but not defined in rmatrix.h, what is automatically returned is an OpaquePointer, which is just fine with me.

Properties defer to the encapsulated _rmatrix pointer, and the string description method makes full use of Swift's stream processing capabilities. deinit calls the RMatrix library's free_RMatrix method.

After all that, factoring a matrix and the GaussFactorization struct are pretty routine.

Sources/Model/Model.swift


public struct GaussFactorization {
    public var PInverse: RMatrix
    public var Lower: RMatrix
    public var Diagonal: RMatrix
    public var Upper: RMatrix

    public init(PInverse: RMatrix, Lower: RMatrix, Diagonal: RMatrix, Upper: RMatrix) {
        self.PInverse = PInverse
        self.Lower = Lower
        self.Diagonal = Diagonal
        self.Upper = Upper
    }
}

public class RMatrix: CustomStringConvertible {
...
    public func factor() -> GaussFactorization {
        let gf = RMatrix_gelim(_rmatrix)!
        let sgf = GaussFactorization(
            PInverse: RMatrix(gf.pointee.pi),
            Lower: RMatrix(gf.pointee.l),
            Diagonal: RMatrix(gf.pointee.d),
            Upper: RMatrix(gf.pointee.u)
        )
        free(gf)
        return sgf
    }
}

Calling the native method RMatrix_gelim returns a newly-allocated struct pointing to four newly-allocated matrices. The matrices are passed to the RMatrix constructor, so that the class takes responsibility for managing their memory. The native struct itself is freed by the RMatrix factor method before returning the Swift struct.

The driver class has no import of native code, and all the allocations look just like Swift objects.


import ArgumentParser
import Foundation
import Model

enum SwiftRMatrixError: Error {
    case runtimeError(String)
}

@main
struct SwiftRMatrix: ParsableCommand {
    @Option(help: "Specify the input file")
    public var inputFile: String

    public func run() throws {
        let url = URL(fileURLWithPath: inputFile)
        var inputText = ""
        do {
            inputText = try String(contentsOf: url, encoding: .utf8)
        } catch {
            throw SwiftRMatrixError.runtimeError("Error reading file [\(inputFile)]")
        }
        let data = inputText
            .split(whereSeparator: \.isNewline)
            .map { $0.trimmingCharacters(in: .whitespaces) }
            .map { line in
                line.split(whereSeparator: { $0.isWhitespace })
                    .map { token in token.split(separator: "/").map { Int($0)! } }
            }
        let m = Model.RMatrix(data)
        print("Input matrix:")
        print(m)

        let factor = m.factor()
        print("Factors into:")
        print("PInverse:")
        print(factor.PInverse)

        print("Lower:")
        print(factor.Lower)

        print("Diagonal:")
        print(factor.Diagonal)

        print("Upper:")
        print(factor.Upper)
    }
}

$ swift run SwiftRMatrix --input-file /home/john/workspace/rmatrix/driver/example.txt
[1/1] Planning build
Building for debugging...
[11/11] Linking SwiftRMatrix
Build of product 'SwiftRMatrix' complete! (1.17s)
Input matrix:
[ {-2,1} {1,3} {-3,4} ]
[ {6,1} {-1,1} {8,1} ]
[ {8,1} {3,2} {-7,1} ]
Factors into:
PInverse:
[ {1,1} {0,1} {0,1} ]
[ {0,1} {0,1} {1,1} ]
[ {0,1} {1,1} {0,1} ]
Lower:
[ {1,1} {0,1} {0,1} ]
[ {-3,1} {1,1} {0,1} ]
[ {-4,1} {0,1} {1,1} ]
Diagonal:
[ {-2,1} {0,1} {0,1} ]
[ {0,1} {17,6} {0,1} ]
[ {0,1} {0,1} {23,4} ]
Upper:
[ {1,1} {-1,6} {3,8} ]
[ {0,1} {1,1} {-60,17} ]
[ {0,1} {0,1} {1,1} ]
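The parsing step in the driver turns each whitespace-separated token, such as 1/3 or -2, into a short list of integers, giving the 3D array the RMatrix constructor expects. Here is the same idea sketched in Python, the language of the next post below. The sample text is my reconstruction from the printed input matrix, not a copy of example.txt, and parse_matrix is a hypothetical name.

```python
def parse_matrix(text):
    """Parse rows of 'n' or 'n/d' tokens into nested lists of ints."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, as splitting on newlines does in the Swift driver
        rows.append([[int(part) for part in token.split("/")] for token in line.split()])
    return rows

sample = "-2 1/3 -3/4\n6 -1 8\n8 3/2 -7\n"
data = parse_matrix(sample)
print(data[0])  # [[-2], [1, 3], [-3, 4]]
```

A token without a slash yields a one-element list, which is why the wrapper falls back to a denominator of 1 when the inner array has only one entry.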

Reflection

Wow, that turned out a lot better than I expected. I thought this would be possible on Linux and MacOS. To be able to get it to work on Windows too was a pleasant surprise. I really like the Swift language: it is expressive and concise and makes really good use of streaming approaches. I hope I get to use it to make money sometime.

Code repository

https://github.com/proftodd/GoingNative/tree/main/SwiftRMatrix

Monday, October 6, 2025

Going Native - Python

Photo by rolve on Freeimages.com

"If you're not stubborn, you'll give up on experiments too soon. And if you're not flexible, you'll pound your head against the wall and you won't see a different solution to a problem you're trying to solve."

Jeff Bezos

This is the continuation of a series on calling native code from high-level languages. Here is a description of the native library I'm calling in this series.

When I got to Python I thought things would get easier. After all, Python was written to be a quick and easy wrapper around C. Alas, no. In a pattern that was becoming familiar, it was fairly easy to wrap the native code and call it from Python, but getting Python to find the native libraries at runtime was another difficult challenge.

There are two traditional ways to call native code from Python. The first is `ctypes`, and the other is writing a Python extension in C using `Cython`. ctypes is generally easier for quick and easy calling of a native library, while Cython is better for truly getting the advantages of calling native code (primarily optimization of execution speed). There are some other approaches, but these are the ones I tried for this post.

Calling the native libraries via ctypes

ctypes is part of the Python standard library, so no steps were necessary to import it into the project.

Loading the libraries was a simple matter of calling a ctypes function and mapping the argument and return types.


import ctypes

class RASHUNAL(ctypes.Structure):
    _fields_ = [("numerator", ctypes.c_int), ("denominator", ctypes.c_int)]

class RMATRIX(ctypes.Structure):
    pass

class GAUSS_FACTORIZATION(ctypes.Structure):
    _fields_ = [
        ("P_INVERSE", ctypes.POINTER(RMATRIX)),
        ("LOWER", ctypes.POINTER(RMATRIX)),
        ("DIAGONAL", ctypes.POINTER(RMATRIX)),
        ("UPPER", ctypes.POINTER(RMATRIX))
    ]

_rashunal_lib = ctypes.CDLL('librashunal.so')
_rashunal_lib.n_Rashunal.argtypes = (ctypes.c_int, ctypes.c_int)
_rashunal_lib.n_Rashunal.restype = ctypes.POINTER(RASHUNAL)

_rmatrix_lib = ctypes.CDLL('librmatrix.so')
_rmatrix_lib.new_RMatrix.argtypes = (ctypes.c_size_t, ctypes.c_size_t, ctypes.POINTER(ctypes.POINTER(RASHUNAL)))
_rmatrix_lib.new_RMatrix.restype = ctypes.POINTER(RMATRIX)

_rmatrix_lib.free_RMatrix.argtypes = (ctypes.POINTER(RMATRIX),)

_rmatrix_lib.RMatrix_gelim.argtypes = (ctypes.POINTER(RMATRIX),)
_rmatrix_lib.RMatrix_gelim.restype = ctypes.POINTER(GAUSS_FACTORIZATION)

_rmatrix_lib.RMatrix_height.argtypes = (ctypes.POINTER(RMATRIX),)
_rmatrix_lib.RMatrix_width.argtypes = (ctypes.POINTER(RMATRIX),)

_rmatrix_lib.RMatrix_get.argtypes = (ctypes.POINTER(RMATRIX), ctypes.c_size_t, ctypes.c_size_t)
_rmatrix_lib.RMatrix_get.restype = ctypes.POINTER(RASHUNAL)

Custom types are declared as subclasses of the ctypes.Structure class. The RMatrix struct is declared but not given a body in the RMatrix library, so I modeled that as a Python class that also extends ctypes.Structure but has no body. Pointer types are modeled as `ctypes.POINTER` objects with an argument of the type or struct the pointer is for.

Note that if a function has a single argument, the field is still argtypes (plural). Also, the argument is a Python tuple, so if it only has one element then it needs a trailing comma. That took me a while to figure out!
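The trailing comma is plain Python rather than anything specific to ctypes, as this small sketch shows. The variable names are only for illustration.

```python
import ctypes

one_arg = (ctypes.c_void_p,)      # a tuple containing one type
not_a_tuple = (ctypes.c_void_p)   # parentheses alone do nothing: this is just the type

print(type(one_arg).__name__)             # tuple
print(not_a_tuple is ctypes.c_void_p)     # True
```

Writing the argument list as a Python list, [ctypes.c_void_p], sidesteps the problem, since a one-element list needs no trailing comma.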

Once the functions are declared, they are called just like regular Python functions.


def allocate_c_rmatrix(m):
    height = m.height
    width = m.width
    element_count = height * width

    c_rashunal_pointers = (ctypes.POINTER(RASHUNAL) * element_count)()
    for i in range(element_count):
        pel = m.data[i]
        r = _rashunal_lib.n_Rashunal(pel.numerator, pel.denominator)
        c_rashunal_pointers[i] = ctypes.cast(r, ctypes.POINTER(RASHUNAL))
    c_rmatrix = _rmatrix_lib.new_RMatrix(height, width, c_rashunal_pointers)
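    # Free the native Rashunal structs that n_Rashunal allocated
    for i in range(element_count):
        cel = c_rashunal_pointers[i]
        _std_lib.free(cel)
    # c_rashunal_pointers itself is a ctypes array owned by Python, so it must not be passed to free()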
    return c_rmatrix

def allocate_python_rmatrix(m):
    height = _rmatrix_lib.RMatrix_height(m)
    width = _rmatrix_lib.RMatrix_width(m)
    p_rashunals = []
    for i in range(1, height + 1):
        for j in range(1, width + 1):
            c_rashunal = _rmatrix_lib.RMatrix_get(m, i, j)
            p_rashunals.append(RMatrix.PRashunal((c_rashunal.contents.numerator, c_rashunal.contents.denominator)))
            _std_lib.free(ctypes.cast(c_rashunal, ctypes.c_void_p))
    return RMatrix.PRMatrix(height, width, p_rashunals)

def factor(m):
    crm = allocate_c_rmatrix(m)
    gf = _rmatrix_lib.RMatrix_gelim(crm)

    p_inverse = allocate_python_rmatrix(gf.contents.P_INVERSE)
    lower = allocate_python_rmatrix(gf.contents.LOWER)
    diagonal = allocate_python_rmatrix(gf.contents.DIAGONAL)
    upper = allocate_python_rmatrix(gf.contents.UPPER)

    _rmatrix_lib.free_RMatrix(gf.contents.P_INVERSE)
    _rmatrix_lib.free_RMatrix(gf.contents.LOWER)
    _rmatrix_lib.free_RMatrix(gf.contents.DIAGONAL)
    _rmatrix_lib.free_RMatrix(gf.contents.UPPER)
    _std_lib.free(ctypes.cast(gf, ctypes.c_void_p))

    return RMatrix.PGaussFactorization(p_inverse, lower, diagonal, upper)

ctypes and objects obtained from it have some utility methods that come in handy. An array type is created by multiplying the element type by the length, and calling that type creates the array: c_rashunal_pointers = (ctypes.POINTER(RASHUNAL) * element_count)(). Pointers can be cast (c_rashunal_pointers[i] = ctypes.cast(r, ctypes.POINTER(RASHUNAL))) and dereferenced through their contents attribute (upper = allocate_python_rmatrix(gf.contents.UPPER)).
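These three features can be tried without any native library. In the sketch below the structs are created and owned by Python, standing in for the ones n_Rashunal would allocate, so nothing here should be passed to free.

```python
import ctypes

class RASHUNAL(ctypes.Structure):
    _fields_ = [("numerator", ctypes.c_int), ("denominator", ctypes.c_int)]

# Python-owned structs stand in for natively allocated ones
values = [RASHUNAL(1, 2), RASHUNAL(3, 4), RASHUNAL(5, 6)]

# Element type times length gives an array type; calling it creates the array
PointerArray = ctypes.POINTER(RASHUNAL) * len(values)
pointers = PointerArray()
for i, value in enumerate(values):
    pointers[i] = ctypes.pointer(value)

# Cast to an untyped pointer and back again
as_void = ctypes.cast(pointers[1], ctypes.c_void_p)
back = ctypes.cast(as_void, ctypes.POINTER(RASHUNAL))

# Dereference with .contents
pairs = [(p.contents.numerator, p.contents.denominator) for p in pointers]
print(pairs)                    # [(1, 2), (3, 4), (5, 6)]
print(back.contents.numerator)  # 3
```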

As in other languages, the structs allocated by the native library and returned to the caller have to be disposed of properly to prevent memory leaks.
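One way to make that cleanup harder to forget is a small context manager that always calls the release function, even if the code in between raises. This is a sketch of the pattern, not what the code above does; owned is a hypothetical helper, and a string stands in for a native pointer so the example runs without the libraries. In real use the release argument would be something like _rmatrix_lib.free_RMatrix.

```python
from contextlib import contextmanager

@contextmanager
def owned(resource, release):
    """Yield resource and guarantee release(resource) runs afterwards."""
    try:
        yield resource
    finally:
        release(resource)

released = []
try:
    with owned("gf-handle", released.append) as handle:
        # Simulate a failure while converting native data to Python objects
        raise RuntimeError("conversion failed")
except RuntimeError:
    pass

print(released)  # ['gf-handle']
```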

So that seems pretty straightforward. I've written this as if it were to be run on a Linux machine. Trying to move to other platforms introduced the complexity.

Making it cross-platform

I started in my Ubuntu WSL shell this time, so note the names of the files in the ctypes.CDLL calls. Very Linux specific. The first task was to make that cross-platform.


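import ctypes
import sys

def load_library(lib_name):
    # Shared libraries follow a different naming convention on each platform
    if sys.platform.startswith("win"):
        filename = f"{lib_name}.dll"
    elif sys.platform.startswith("darwin"):
        filename = f"lib{lib_name}.dylib"
    else:
        filename = f"lib{lib_name}.so"

    try:
        return ctypes.CDLL(filename)
    except OSError as e:
        # Chain the original error so the loader's own message is not lost
        raise OSError(f"Could not load library '{filename}'") from e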

_rashunal_lib = load_library('rashunal')
_rashunal_lib.n_Rashunal.argtypes = (ctypes.c_int, ctypes.c_int)
_rashunal_lib.n_Rashunal.restype = ctypes.POINTER(RASHUNAL)
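Separating the naming rule from the actual loading makes the rule easy to check on any machine. The sketch below takes the platform string as a parameter instead of reading sys.platform; library_filename is a hypothetical helper, not part of the project.

```python
def library_filename(lib_name, platform):
    """Map a bare library name to the file name used on the given platform."""
    if platform.startswith("win"):
        return f"{lib_name}.dll"
    if platform.startswith("darwin"):
        return f"lib{lib_name}.dylib"
    return f"lib{lib_name}.so"

for platform in ("win32", "darwin", "linux"):
    print(platform, library_filename("rashunal", platform))
```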

Also, I needed a different approach to load the standard libraries. As discussed in the C# post, the standard libraries have different names on the three operating systems, so a simple root name based approach wouldn't work:


def load_standard_library():
    if sys.platform.startswith("win"):
        return ctypes.CDLL('ucrtbase.dll')
    elif sys.platform.startswith("darwin"):
        return ctypes.CDLL('libSystem.dylib')
    else:
        return ctypes.CDLL('libc.so.6')

_std_lib = load_standard_library()
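_std_lib.free.argtypes = (ctypes.c_void_p,)
_std_lib.free.restype = None
_std_lib.malloc.argtypes = (ctypes.c_size_t,)
# Without an explicit restype ctypes assumes int, which would truncate a 64-bit pointer
_std_lib.malloc.restype = ctypes.c_void_p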

Not too bad. That worked fine on Linux, and also on MacOS if I put /usr/local/lib in DYLD_LIBRARY_PATH. However, Windows was the standout this time. Turns out since Windows 10 "ctypes.CDLL and the system loader sometimes ignore PATH for dependent DLLs due to 'SafeDllSearchMode' and other loader rules." Thanks, Microsoft. (Python shares the blame: since Python 3.8, ctypes on Windows no longer consults PATH when resolving a DLL's dependencies.)

You can add to the Python interpreter's search path by making os.add_dll_directory calls. To keep the code flexible, I went back to the environment variable trick to add the required locations.


def get_dll_dirs_from_env(env_var="RMATRIX_LIB_DIRS"):
    val = os.environ.get(env_var, "")
    if not val:
        return []
    return val.split(os.pathsep)

def load_library(lib_name):
    if sys.platform.startswith("win"):
        dll_dirs = get_dll_dirs_from_env()
        for d in dll_dirs:
            if not os.path.isdir(d):
                continue
            os.add_dll_directory(d)
        filename = f"{lib_name}.dll"
    elif sys.platform.startswith("darwin"):
        filename = f"lib{lib_name}.dylib"
    else:
        filename = f"lib{lib_name}.so"

    try:
        return ctypes.CDLL(filename)
    except OSError as e:
        raise OSError(f"Could not load library '{filename}'") from e

> $env:RMATRIX_LIB_DIRS="C:\Users\john.todd\local\rashunal\lib;C:\Users\john.todd\local\rmatrix\lib"
> python main.py /Users/john.todd/source/repos/rmatrix/driver/example.txt
using data from file /Users/john.todd/source/repos/rmatrix/driver/example.txt
Input matrix:
[ {-2,1} {1,3} {-3,4} ]
[ {6,1} {-1,1} {8,1} ]
[ {8,1} {3,2} {-7,1} ]

PInverse:
[ {1,1} {0,1} {0,1} ]
[ {0,1} {0,1} {1,1} ]
[ {0,1} {1,1} {0,1} ]

Lower:
[ {1,1} {0,1} {0,1} ]
[ {-3,1} {1,1} {0,1} ]
[ {-4,1} {0,1} {1,1} ]

Diagonal:
[ {-2,1} {0,1} {0,1} ]
[ {0,1} {17,6} {0,1} ]
[ {0,1} {0,1} {23,4} ]

Upper:
[ {1,1} {-1,6} {3,8} ]
[ {0,1} {1,1} {-60,17} ]
[ {0,1} {0,1} {1,1} ]

And voila, works on all three platforms.
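The environment variable is split on os.pathsep, which is ';' on Windows and ':' on Linux and MacOS. That is why the PowerShell example separates the two directories with a semicolon. A small sketch of the splitting, with hypothetical directory names and a hypothetical split_dirs helper:

```python
import os

def split_dirs(value):
    """Split a PATH-style string into its non-empty entries."""
    return [d for d in value.split(os.pathsep) if d]

# Build the example with os.pathsep so it is valid on whichever platform runs it
example = os.pathsep.join(["/opt/rashunal/lib", "/opt/rmatrix/lib"])
dirs = split_dirs(example)
print(dirs)  # ['/opt/rashunal/lib', '/opt/rmatrix/lib']
```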

Calling the native libraries via Cython

Cython is a weird dialect? sublanguage? independent language? It looks most like Python, but includes some elements of C. Hence the name, a combination of C and Python. The Cython documentation and examples in the tutorials discussed mainly wrapping C standard library functions or an implementation of Queues. I couldn't find a good example of wrapping a custom library, or two custom libraries with dependencies on each other like my model libraries. So once again, ChatGPT and I plunged in.

For this experiment I worked in my Ubuntu WSL shell. I wound up with two Python modules that can be separately compiled and packaged and installed via pip.

The easier one: packaging Rashunal

Cython requires a declarations file (pxd) and an implementation file (pyx). The convention seems to be to name the declarations file as the name of the library with a 'c' prepended. The pyx file can be named just the name of the library.


# crashunal.pxd
cdef extern from "rashunal.h":
    ctypedef struct Rashunal:
        int numerator
        int denominator
    
    Rashunal *n_Rashunal(int numerator, int denominator)

# rashunal.pyx
from libc.stdlib cimport free
cimport crashunal

cdef class Rashunal:
    cdef crashunal.Rashunal *_c_rashunal

    def __cinit__(self, numerator, denominator):
        self._c_rashunal = crashunal.n_Rashunal(numerator, denominator)
        if self._c_rashunal is NULL:
            raise MemoryError()
    
    def __dealloc__(self):
        if self._c_rashunal is not NULL:
            free(self._c_rashunal)
            self._c_rashunal = NULL
    
    def __str__(self):
        return f"{{{self._c_rashunal.numerator},{self._c_rashunal.denominator}}}"
    
    @property
    def numerator(self):
        return self._c_rashunal.numerator
    
    @property
    def denominator(self):
        return self._c_rashunal.denominator

In Cython, things that begin with a "c" are related to the native library and the C code. So "cimport" means "import something from the C library", "cdef" means "declare this as something that will be used by the C code", and "ctypedef" means "this is a type that will be coming from C". Things without the "c" prefix are meant to be used by the Python code. (There is also "cpdef", which declares a function that can be called from both the C and Python code. I'm not sure how that would be useful.)

crashunal.pxd declares the Rashunal struct and the n_Rashunal method. It says their definitions can be obtained from the rashunal.h header file, wherever that may be. (I'll come back to that later.)

rashunal.pyx declares a class, Rashunal, that wraps a crashunal.Rashunal struct and holds a pointer to it. Because it is a cdef class (what Cython calls an extension type), it can store the C pointer directly while still being usable from Python like an ordinary class. Rashunal's constructor accepts a numerator and a denominator, passing them to the native n_Rashunal method, and holding on to the struct that is returned. It also declares a __dealloc__ method that frees the struct when the object goes out of scope, and a couple of convenience properties for easy access to the fields of the struct.

Cython modules are built using a setup.py file:


import os
from setuptools import setup, Extension
from Cython.Build import cythonize

extensions = [
    Extension(
        "rashunal._rashunal",
        ["rashunal/rashunal.pyx"],
        libraries=["rashunal"],
        include_dirs=[os.environ.get("RASHUNAL_INCLUDE", "/usr/local/include")],
        library_dirs=[os.environ.get("RASHUNAL_LIB", "/usr/local/lib")]
    )
]

setup(
    name="rashunal",
    version="0.1.0",
    packages=["rashunal"],
    ext_modules=cythonize(
        extensions,
        language_level="3",
        include_path=["rashunal"]
    ),
)

The extensions variable is the list of all extensions that are to be built. More than one can be built by a single setup.py file, and I did that for a while with Rashunal and RMatrix, but backed off to one at a time in order to make the process and packages more granular. The extension is named rashunal._rashunal to reflect finding the package and paralleling the directory structure. The underscore is to hide the C library and prevent import confusion when bringing it into a client. Most of the flags here are related to finding the C libraries: libraries is the list of libraries to link to, include_dirs is where to find their header files (if they're not part of the project), and library_dirs is where to find their compiled binaries. If you're building at the command line these can be supplemented by flags, but for reasons I'll discuss later I had to complete them with environment variables and default values.

The setup function describes how to actually build the extensions. It needs the name(s) of the package(s) to build and the list of extensions to include. The include_path here is where to find the pxd and pyx files.


# __init__.py
from ._rashunal import Rashunal

__init__.py is required, but can be empty. I added this import to both obscure the C library and simplify the import. If __init__.py were empty the build would work and the code could be imported, but it would look pretty ugly: from rashunal._rashunal import Rashunal.
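The re-export trick can be tried without compiling anything. The sketch below builds a throwaway package in a temporary directory; demo_pkg, _impl, and Thing are made-up names, with the pure-Python _impl module standing in for the compiled _rashunal extension.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root):
    """Create a throwaway package whose __init__.py re-exports a private module."""
    pkg = Path(root) / "demo_pkg"
    pkg.mkdir()
    # Stand-in for the compiled extension module.
    (pkg / "_impl.py").write_text("class Thing:\n    pass\n")
    # Same shape as the __init__.py above: lift the class to package level.
    (pkg / "__init__.py").write_text("from ._impl import Thing\n")
    return pkg


def load_demo_package():
    """Build the package in a temporary directory and import it."""
    with tempfile.TemporaryDirectory() as root:
        build_demo_package(root)
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            return importlib.import_module("demo_pkg")
        finally:
            sys.path.remove(root)


demo_pkg = load_demo_package()
# Clients can reach the class at package level because of the re-export,
# while the class itself still lives in the private module.
print(demo_pkg.Thing.__module__)  # prints "demo_pkg._impl"
```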

Here's the directory setup:


$ tree .
.
├── rashunal
│   ├── __init__.py
│   ├── crashunal.pxd
│   └── rashunal.pyx
└── setup.py

1 directory, 4 files

Cython and its related tools are not part of the Python standard library, so they have to be installed.


$ pip install Cython setuptools
$ python setup.py build_ext -i

This works, and the output can be imported into client code and be used. I wanted to take the further step and make this into a pip package, however. That required a couple more files.


# pyproject.toml
[build-system]
requires = ["setuptools>=61.0", "wheel", "Cython"]
build-backend = "setuptools.build_meta"

[project]
name = "rashunal"
version = "0.1.0"
description = "Python bindings for the Rashunal C library"
authors = [{ name = "John Todd" }]
readme = "README.md"
requires-python = ">=3.8"

# MANIFEST.in
include rashunal/*.pxd
include rashunal/*.pyx

$ tree .
.
├── MANIFEST.in
├── README.md
├── pyproject.toml
├── rashunal
│   ├── __init__.py
│   ├── crashunal.pxd
│   └── rashunal.pyx
└── setup.py

1 directory, 7 files

pyproject.toml gives instructions on how the wheel file is to be built and a description of the project, including any dependencies or runtime requirements. MANIFEST.in says that the pxd and pyx files should be included in the wheel. The build tool will need those in order to compile the Cython code later on.

Now the package can be built at the command line, but include_dirs and library_dirs cannot be added at this point. This is why I had to include environment variables in setup.py to find the C header and library files. I also didn't want this experimental project permanently installed in my Python environment, so I created a virtual environment to test them.

The build tool also has to be installed before it can be used.


$ python3 -m pip install build
$ python3 -m build
$ python3 -m venv venv-test
$ source venv-test/bin/activate
(venv-test) $ pip install --upgrade pip wheel
(venv-test) $ pip install dist/rashunal-0.1.0-cp310-cp310-linux_x86_64.whl
(venv-test) $ cd ~
(venv-test) $ python
>>> from rashunal import Rashunal
>>> r = Rashunal(1, 2)
>>> print(r)
{1,2}

Note that when starting the Python REPL and importing the code, I had to be in a different directory than the project directory so the interpreter didn't confuse the installed wheel with the source code.
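The confusion happens because the interactive interpreter puts the current directory at the front of sys.path, so a source directory named rashunal wins over the installed wheel. Here is a sketch for checking which copy an import would pick up. import_origin and shadowed_by are hypothetical helper names, and the demonstration uses the standard library's json package because rashunal may not be installed where this runs.

```python
import importlib.util
from pathlib import Path


def import_origin(name):
    """Return the file that `import name` would load, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin)


def shadowed_by(name, directory):
    """True if `import name` would resolve to a file somewhere under `directory`."""
    origin = import_origin(name)
    if origin is None:
        return False
    return Path(directory).resolve() in origin.resolve().parents


# Show where the json package would be loaded from.
print(import_origin("json"))
```

Running shadowed_by("rashunal", os.getcwd()) from the project directory (after import os) would be one way to confirm that the source tree, not the wheel, is being picked up.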

The harder one: packaging RMatrix

Things got really hairy when I tried to package RMatrix because of its dependency on Rashunal. I imagined that Rashunal and RMatrix would be packaged separately, since a library of rational numbers could theoretically be used for other purposes than matrices and linear algebra.

The __init__.py, pxd and pyx files were fairly straightforward and comparable to Rashunal's:


# __init__.py
from ._rmatrix import RMatrix

# crmatrix.pxd
cimport crashunal

cdef extern from "rmatrix.h":
    ctypedef struct RMatrix:
        pass

    # Declared before the functions so that RMatrix_gelim can refer to it
    ctypedef struct Gauss_Factorization:
        const RMatrix *pi
        const RMatrix *l
        const RMatrix *d
        const RMatrix *u

    RMatrix *new_RMatrix(size_t height, size_t width, crashunal.Rashunal **data)
    void free_RMatrix(RMatrix *m)
    size_t RMatrix_height(const RMatrix *m)
    size_t RMatrix_width(const RMatrix *m)
    Gauss_Factorization *RMatrix_gelim(const RMatrix *m)
    crashunal.Rashunal *RMatrix_get(const RMatrix *m, size_t row, size_t col)

# rmatrix.pyx
from libc.stdlib cimport malloc, free
cimport crashunal
cimport crmatrix

cdef class RMatrix:
    cdef crmatrix.RMatrix *_c_rmatrix

    def __cinit__(self, data):
        cdef height = len(data)
        cdef width = len(data[0])
        cdef el_count = height * width
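        # malloc returns void *, so Cython needs an explicit cast here
        cdef crashunal.Rashunal **arr = <crashunal.Rashunal **>malloc(el_count * sizeof(crashunal.Rashunal*))
        if arr is NULL:
            raise MemoryError()
        # malloc does not zero memory: clear every slot so the cleanup loop is safe after an early error
        for i in range(el_count):
            arr[i] = NULL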

        try:
            for i in range(el_count):
                el = data[i // width][i % width]
                num = el[0]
                den = el[1] if len(el) == 2 else 1
                arr[i] = crashunal.n_Rashunal(num, den)
                if arr[i] is NULL:
                    raise MemoryError()
            self._c_rmatrix = crmatrix.new_RMatrix(height, width, arr)
            if self._c_rmatrix is NULL:
                raise MemoryError()
        finally:
            for i in range(el_count):
                if arr[i] is not NULL:
                    crashunal.free(arr[i])
            crashunal.free(arr)
    
    def __dealloc__(self):
        if self._c_rmatrix is not NULL:
            crmatrix.free_RMatrix(self._c_rmatrix)
            self._c_rmatrix = NULL

    @property
    def height(self):
        return crmatrix.RMatrix_height(self._c_rmatrix)

    @property
    def width(self):
        return crmatrix.RMatrix_width(self._c_rmatrix)
    
    def factor(self):
        cdef crmatrix.Gauss_Factorization *f
        f = crmatrix.RMatrix_gelim(self._c_rmatrix)
        try:
            result = (
                _crmatrix_to_2d_array(f.pi),
                _crmatrix_to_2d_array(f.l),
                _crmatrix_to_2d_array(f.d),
                _crmatrix_to_2d_array(f.u)
            )
        finally:
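            # The struct fields are const pointers; cast the const away before freeing
            if f.pi != NULL: crmatrix.free_RMatrix(<crmatrix.RMatrix *>f.pi)
            if f.l  != NULL: crmatrix.free_RMatrix(<crmatrix.RMatrix *>f.l)
            if f.d  != NULL: crmatrix.free_RMatrix(<crmatrix.RMatrix *>f.d)
            if f.u  != NULL: crmatrix.free_RMatrix(<crmatrix.RMatrix *>f.u)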
            crashunal.free(f)
        return result

cdef _crmatrix_to_2d_array(const crmatrix.RMatrix *crm):
    cdef height = crmatrix.RMatrix_height(crm)
    cdef width = crmatrix.RMatrix_width(crm)
    cdef result = []
    cdef const crashunal.Rashunal *el
    for i in range(height):
        row = []
        for j in range(width):
            el = crmatrix.RMatrix_get(crm, i + 1, j + 1)
            row.append((el.numerator, el.denominator))
            crashunal.free(<void *>el)
        result.append(row)
    return result

The type definitions mirror what is in the native libraries. For the implementation I backed off to passing the RMatrix constructor a 3D array of integers rather than a custom object for maximum flexibility when packaged for pip. By now the allocation and deallocation code should be understandable, even if the syntax varies from implementation to implementation. The pointer casts when deallocating memory are necessary to avoid C compiler warnings.
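The indexing in __cinit__ is easier to see in plain Python. This sketch mirrors only the row-major walk over the nested list and the default denominator of 1; flatten_entries is a made-up name, and it builds tuples where the real constructor calls n_Rashunal.

```python
def flatten_entries(data):
    """Flatten a rectangular list of rows into row-major (numerator, denominator) pairs.

    Each element is a sequence [numerator] or [numerator, denominator];
    a missing denominator defaults to 1, as in RMatrix.__cinit__.
    """
    height = len(data)
    width = len(data[0])
    pairs = []
    for i in range(height * width):
        # i // width selects the row, i % width the column.
        el = data[i // width][i % width]
        num = el[0]
        den = el[1] if len(el) == 2 else 1
        pairs.append((num, den))
    return pairs


print(flatten_entries([[[1], [2], [3, 2]], [[4, 3], [5], [6]]]))
# prints [(1, 1), (2, 1), (3, 2), (4, 3), (5, 1), (6, 1)]
```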


# setup.py
import os
import sys
from setuptools import setup, Extension
from Cython.Build import cythonize

extensions = [
    Extension(
        "rmatrix._rmatrix",
        ["rmatrix/rmatrix.pyx"],
        include_dirs=[os.environ.get("RMATRIX_INCLUDE", "/usr/local/include")],
        libraries=["rmatrix"],
        library_dirs=[os.environ.get("RMATRIX_LIB", "/usr/local/lib")]
    )
]

setup(
    name="rmatrix",
    version="0.1.0",
    packages=["rmatrix"],
    install_requires=["rashunal>=0.1.0"],
    ext_modules=cythonize(
        extensions,
        language_level="3",
        include_path=["rmatrix"]
    )
)

setup.py is very similar to Rashunal's. Notice the install_requires argument to setup. Compiling RMatrix also needs Rashunal's pxd and pyx files, which would ordinarily mean adding their location to include_path. With the two packaged as separate projects, though, neither I nor ChatGPT could come up with a way to reference them here. Fortunately, we did discover a way to do it in virtual environments.

pyproject.toml and MANIFEST.in were pretty much identical to Rashunal's. The toml file did include a field saying it depends on Rashunal.


# pyproject.toml
[build-system]
requires = ["setuptools>=61.0", "wheel", "Cython"]
build-backend = "setuptools.build_meta"

[project]
name = "rmatrix"
version = "0.1.0"
description = "Python bindings for the RMatrix C library"
authors = [{ name = "John Todd" }]
readme = "README.md"
requires-python = ">=3.8"
dependencies = ["rashunal>=0.1.0"]

# MANIFEST.in
include rmatrix/*.pxd
include rmatrix/*.pyx

Much thrashing ensued as I tried to get RMatrix to compile, mainly with locating the Rashunal library. As I outlined above, the compiler needed to find Rashunal's pxd and pyx files. Assuming these would be packaged separately, I didn't want to refer to the source code, even though it was right next to the rmatrix code in my project directory. Instead, I eventually noticed that the wheel file contained them and they were extracted when it was installed in my virtual environment. The build process works in its own fresh virtual environment, but there was no way to install Rashunal in it before trying to install RMatrix. I could reuse the test virtual environment I already had with Rashunal installed in it, however.


$ cd rmatrix
$ source ~/workspace/venv-test/bin/activate
(venv-test) $ python -m build --no-isolation
(venv-test) $ pip install ~/workspace/GoingNative/cython_rmatrix/rmatrix/dist/rmatrix-0.1.0-cp310-cp310-linux_x86_64.whl
(venv-test) $ cd ~
(venv-test) $ python
>>> from rmatrix import RMatrix
>>> crm = RMatrix([[[1], [2], [3,2]], [[4,3], [5], [6]]])
>>> (p_inverse, lower, diagonal, upper) = crm.factor()
>>> print(lower)

Not sure if that's an acceptable way to do it, but at this point I was just happy it worked. Once again, I needed to change to a different directory when starting the REPL to avoid confusing the installed wheel with the source code.
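A possible refinement would be to ask the import system where the installed rashunal package lives and add that directory to include_path. This is a sketch, not something verified against this project: installed_package_dirs is a hypothetical helper, it assumes the wheel ships crashunal.pxd inside the package directory, and it still needs rashunal installed in the environment doing the build, so --no-isolation would remain necessary. The demonstration uses the standard library's json package as a stand-in.

```python
import importlib.util


def installed_package_dirs(name):
    """Return the directories of an installed package, or [] if it is missing."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return []
    if spec is None or spec.submodule_search_locations is None:
        return []
    return list(spec.submodule_search_locations)


# In setup.py this could feed cythonize, for example:
#   include_path=["rmatrix", *installed_package_dirs("rashunal")]
print(installed_package_dirs("json"))
```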

So there it is, in Linux at least. Some other possibilities ChatGPT mentioned that I didn't look into are:

  • Better packaging of the native libraries using pkg-config. This could probably be done in the CMake code.
  • Packing the generated C file along with or instead of the pxd and pyx files for downstream compiling.
  • Packing the binaries themselves within the wheel so they just work.
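To give a flavor of the first option, here is a sketch of how setup.py might turn pkg-config output into Extension arguments. It assumes the native library installs a .pc file (for example rashunal.pc), which this project does not currently do; parse_pkg_config_flags and pkg_config are hypothetical helper names.

```python
import shlex
import subprocess


def parse_pkg_config_flags(output):
    """Split pkg-config output into the keyword arguments Extension expects."""
    flags = {"include_dirs": [], "library_dirs": [], "libraries": []}
    for token in shlex.split(output):
        if token.startswith("-I"):
            flags["include_dirs"].append(token[2:])
        elif token.startswith("-L"):
            flags["library_dirs"].append(token[2:])
        elif token.startswith("-l"):
            flags["libraries"].append(token[2:])
    return flags


def pkg_config(package):
    """Run `pkg-config --cflags --libs <package>` and parse the result."""
    result = subprocess.run(
        ["pkg-config", "--cflags", "--libs", package],
        check=True, capture_output=True, text=True,
    )
    return parse_pkg_config_flags(result.stdout)


# Parsing a sample line; pkg_config("rashunal") would need a real .pc file.
print(parse_pkg_config_flags("-I/usr/local/include -L/usr/local/lib -lrashunal"))
```

The resulting dictionary could be passed to Extension with **flags in place of the hard-coded paths and environment variables.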

I briefly looked into doing this on Windows and MacOS, but ran into insurmountable difficulties. I won't go into the details, but the gist is that virtual environments on Windows and MacOS don't inherit settings from the shell they are invoked from. So there is no way to point to the native headers or binaries to get everything to compile. Both require modifying the source code in setup.py or pyproject.toml in order to set the paths. So if you're trying to write a cross-platform Python library that relies on native libraries, good luck. I can't help you.

Reflection

Wow, that was a journey.

The ctypes approach was definitely simpler, and I got it to work on all three platforms. The Cython approach was much more complicated. I'm not sure how to measure or assess the claims that it is more performant than the ctypes approach. It seems to be better for packaging up the C libraries in a format suitable to Python. Once the pip packages are available clients can use them in a way that Python developers are intimately familiar with. But boy, was it a bear to get working. Still, I feel a sense of accomplishment getting it done, and I do think I learned more about compiling and linking tools, even if I don't fully understand all the syntax and tools.

Code repositories

https://github.com/proftodd/GoingNative/tree/main/python_rmatrix
https://github.com/proftodd/GoingNative/tree/main/cython_rmatrix