How to Use the Foreign Function API in Java 22 to Call C Libraries

14 days ago (ifesunmola.com)

I am sort of surprised that there isn't a widely used tool that uses codegen to generate jni bindings sort of like what the jna does but at build time. You could go meta and bundle a builder in a jar that looks for the shared library in a particular place and shells out to build and install the native library if it is missing on the host computer. This would run once pretty similar I think to bundling native code in npm.

I have bundled shared libraries for five or six platforms in a java library that needs to make syscalls. It works but it is a pain if anything ever changes or a new platform needs to be brought up. Checking in binaries always feels icky but is necessary if not all targets can be built on a single machine.

The problem with the new api is that people upgrade java very slowly in most contexts. For an oss library developer, I see very little value add in this feature because I'm still stuck for all of my users who are using an older version of java. If I integrate the new ffi api, now I have to support both it and the jni api.
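
One way to carry both code paths is to pick the binding at startup from the running Java version. This is only a minimal sketch: `Backend`, `JniBackend` and `FfmBackend` are hypothetical names, not from any real library, and a real project would probably also keep the FFM classes in a multi-release JAR so that older JVMs never try to load them.

```java
// Sketch: choose a native-binding backend from the running Java version.
// Backend, JniBackend and FfmBackend are hypothetical names for illustration.
final class Backends {
    // The FFM API became final in Java 22 (JEP 454).
    static final int FIRST_FFM_RELEASE = 22;

    // Pure function, so the decision can be tested without a second JDK.
    static Backend forFeatureVersion(int feature) {
        return feature >= FIRST_FFM_RELEASE ? new FfmBackend() : new JniBackend();
    }

    static Backend forCurrentRuntime() {
        // Runtime.version().feature() is the major release number (Java 10+).
        return forFeatureVersion(Runtime.version().feature());
    }

    public static void main(String[] args) {
        System.out.println("Selected backend: " + forCurrentRuntime().name());
    }
}

interface Backend {
    String name();
}

// Placeholder for the existing JNI-based implementation.
final class JniBackend implements Backend {
    public String name() { return "jni"; }
}

// Placeholder for the new FFM-based implementation.
final class FfmBackend implements Backend {
    public String name() { return "ffm"; }
}
```

This does not remove the cost of maintaining two implementations; it only keeps the switch in one place.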

  • > I am sort of surprised that there isn't a widely used tool that uses codegen to generate jni bindings sort of like what the jna does but at build time

    There are several, including SWIG.

  • Back in 1998 I wrote a code generator to make JNI stubs for LAPACK. It’s the kind of programming that goes that way.

  • There is a library called jnigen [1], mainly used by the libGDX framework [2], but I don't see it used in many other projects. Personally I use it to maintain a set of Lua C API bindings for some platforms [3] and it works sort of OK once you manage to somehow set up a workflow for building and testing the binaries.

    > It works but it is a pain if anything ever changes or a new platform needs to be brought up. Checking in binaries always feels icky but is necessary if not all targets can be built on a single machine.

    It is definitely a pain when you cannot test all changes on a single local machine. But I would argue that it is true whenever multiple platforms (or maybe even multiple glibc versions) are involved, regardless of what languages/libraries/tools you use.

    [1] https://libgdx.com/wiki/utils/jnigen [2] https://github.com/libgdx/libgdx [3] https://github.com/gudzpoz/luajava

What I'm missing is a model for building/distributing those C libraries with a java application.

Every ffi example I've found seems to operate on the assumption that you want to invoke syscalls or libc, which (with possibly the exception of like madvise and aioring) Java already mostly has decent facilities to interact with even without native calls.

  • Native libraries are typically packaged inside a jar so that everything works over the existing build and dependency management systems.

    For example, each of these jars named "native-$os-$arch.jar" contains a .dll/.so/.dylib: https://repo1.maven.org/maven2/com/aayushatharva/brotli4j/

    JNA will extract the appropriate native library (using os.name and os.arch system properties), save the library to a temp file, then load it.
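
    The selection step can be sketched as a mapping from the two system properties to a resource path. The `native/<os>-<arch>/` layout below is an assumption made for illustration; it is not JNA's or brotli4j's actual scheme.

    ```java
    import java.util.Locale;

    // Sketch: map os.name / os.arch to the path of a bundled native library.
    // The "native/<os>-<arch>/" layout is an assumption, not JNA's real scheme.
    final class NativeResource {
        static String os(String osName) {
            String n = osName.toLowerCase(Locale.ROOT);
            // Check "darwin" before "win": "darwin" contains "win".
            if (n.contains("mac") || n.contains("darwin")) return "macos";
            if (n.contains("linux")) return "linux";
            if (n.contains("win")) return "windows";
            throw new UnsupportedOperationException("Unsupported OS: " + osName);
        }

        static String arch(String osArch) {
            String a = osArch.toLowerCase(Locale.ROOT);
            if (a.equals("amd64") || a.equals("x86_64")) return "x86_64";
            if (a.equals("aarch64") || a.equals("arm64")) return "aarch64";
            throw new UnsupportedOperationException("Unsupported arch: " + osArch);
        }

        static String fileName(String os, String lib) {
            switch (os) {
                case "windows": return lib + ".dll";
                case "macos":   return "lib" + lib + ".dylib";
                default:        return "lib" + lib + ".so";
            }
        }

        static String path(String osName, String osArch, String lib) {
            String os = os(osName);
            return "native/" + os + "-" + arch(osArch) + "/" + fileName(os, lib);
        }

        public static void main(String[] args) {
            System.out.println(path(System.getProperty("os.name"),
                                    System.getProperty("os.arch"), "brotli"));
        }
    }
    ```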

  • > Every ffi example I've found seem to operate on the assumption that you want to invoke syscalls or libc ... Java already mostly has decent facilities to interact with even without native calls.

    Because you would use ffi to interact with libraries that don't have Java wrappers yet: IE, you're writing the wrapper.

    Using syscalls or libc is a way to write an example against a known library that you're probably familiar with.
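
    For reference, the usual libc example looks roughly like this. It is a minimal sketch that assumes Java 22 or later and a 64-bit platform where `size_t` maps to `JAVA_LONG`; the class name `StrlenDemo` is made up.

    ```java
    import java.lang.foreign.Arena;
    import java.lang.foreign.FunctionDescriptor;
    import java.lang.foreign.Linker;
    import java.lang.foreign.MemorySegment;
    import java.lang.foreign.ValueLayout;
    import java.lang.invoke.MethodHandle;

    // Sketch: call size_t strlen(const char *s) from the C standard library.
    final class StrlenDemo {
        static long strlen(String s) {
            Linker linker = Linker.nativeLinker();
            // Assumes a 64-bit platform, where size_t is 8 bytes.
            MethodHandle handle = linker.downcallHandle(
                    linker.defaultLookup().find("strlen").orElseThrow(),
                    FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
            try (Arena arena = Arena.ofConfined()) {
                // Copies the string into native memory as a NUL-terminated UTF-8 string.
                MemorySegment cString = arena.allocateFrom(s);
                return (long) handle.invokeExact(cString);
            } catch (Throwable t) {
                throw new IllegalStateException("strlen call failed", t);
            }
        }

        public static void main(String[] args) {
            System.out.println(strlen("Hello, FFM"));
        }
    }
    ```

    Swapping `defaultLookup()` for `SymbolLookup.libraryLookup(path, arena)` is the usual way to point the same code at your own library instead of libc.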

  • The recommended distribution model for Java applications is a jlinked runtime image [1], which supports including native libraries in the image.

    [1]: Technically, this is the only distribution model because all Java runtimes as of JDK 9 are created with jlink, including the runtime included in the JDK (which many people use as-is), but I mean a custom runtime packaged with the application.

  • So, other people have already answered this, but this does seem to be a gap where many developers lack some piece of knowledge to chain the whole solution together. You normally package this sort of thing by putting the native library in a jar, extracting it to a tmp file that will be deleted on exit, and opening that dynamic library.

    I’ve met many perfectly reasonable developers who do know all those steps can be done but can’t put them all together - maybe because it just hasn’t clicked that you can store a library in a jar. It feels like something tutorials should cover, but I think falls into the, “surely everyone can work it out?” category.
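
    Since tutorials rarely show it, here is a minimal sketch of those steps. The class name and the resource path are hypothetical, and error handling is kept to a minimum.

    ```java
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.UncheckedIOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;

    // Sketch: copy a native library out of the jar and load it.
    final class NativeExtractor {
        // Copies the stream to a fresh temp file and returns its path.
        static Path extract(InputStream in, String suffix) {
            try {
                Path tmp = Files.createTempFile("native-", suffix);
                tmp.toFile().deleteOnExit();
                Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
                return tmp;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        // resourcePath is hypothetical, e.g. "/native/linux-x86_64/libfoo.so".
        static void loadFromJar(String resourcePath, String suffix) {
            try (InputStream in = NativeExtractor.class.getResourceAsStream(resourcePath)) {
                if (in == null) {
                    throw new UnsatisfiedLinkError("Not on classpath: " + resourcePath);
                }
                // System.load needs an absolute path to a real file on disk.
                System.load(extract(in, suffix).toAbsolutePath().toString());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
    ```

    A caller would run something like `NativeExtractor.loadFromJar("/native/linux-x86_64/libfoo.so", ".so")` once, before the first native call.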

    • > extracting it to a tmp file that will be deleted on exit,

      Actually, you delete it immediately (after load) on anything that's not Windows. You can attempt the delete on Windows too, but it's likely to return false.

      deleteOnExit just stores the path to delete and uses a shutdownHook to actually call delete. Nothing really special about it

    • You would also need to learn about Maven profiles and activation. And for other build tools, you'll be delighted to know they have partial support.

    • > extracting it to a tmp file

      i wonder if there's a way to do this entirely in memory? Because some deployment scenarios might not have disk space at all.

  • If your app is open source, or you're willing to buy a commercial tool, then you could try Conveyor from my company [1]. It will:

    - Find all the shared libraries in your JARs or configured app inputs (files in your build/source tree)

    - Sniff them to figure out what OS and CPU arch they are for

    - Bundle them into the right package for each platform that it makes, in the right place to be found by System.loadLibrary()

    - Sign them if necessary

    - Delete them from the JARs now they are extracted. Optionally extract them from library JARs, sign them and then put them back if your library refuses to load the shared library from disk instead of unpacking it (most libs don't need this)

    - JLink a bundled JVM for your app for each platform you target, using jdeps to figure out the right set of modules, and combine that with your shared libs.

    When building Debian/Ubuntu packages it will also:

    - Read the .so library dependencies, look up the packages that contain those other shared libraries and add package dependencies on those packages, so "apt install" will do the right thing.

    So that makes it a lot easier to distribute Java apps that use native code.

    [1] https://www.hydraulic.dev/

  • You do it the standard way, package them inside the jar file.

    • Oh, does this actually work?

      I was under the assumption that it was dynamically linking the library with the OS dynamic linker, which in no OS I'm aware of is capable of loading libraries inside of zip files.

      Not sure where I got that notion. Maybe I was overthinking this.

Compared to .NET's P/Invoke this is still way too convoluted. Of course, Java has its own domain problems: treating everything as a reference (and thus a pointer; there is a reason Java has NullPointerException rather than NullReferenceException) and lacking stack value types (everything lives on the heap unless escape analysis allows some data to stay on the stack, but that is uncontrollable anyway) make translating Plain-Old-Data (POD) types very difficult in Java, whereas it is mostly a no-op in C#. That's why JNI exists as a mediator between native code and the Java VM.

In C# I can just do something like this conceptual code:

```

// FILE *fopen(const char *filename, const char *mode)
[DllImport("libc")] public static extern nint fopen([MarshalAs(UnmanagedType.LPStr)] string filename, [MarshalAs(UnmanagedType.LPStr)] string mode);

// char *fgets(char *str, int n, FILE *stream)
// str is an output buffer, so it is passed as a writable byte[] rather than a string
[DllImport("libc")] public static extern nint fgets(byte[] str, int n, nint stream);

// int fclose(FILE *stream)
[DllImport("libc")] public static extern int fclose(nint stream);

```

So much less code, and so much more precise than any of the Java JNI and FFI stuff.

  • Java's FFI is currently very low-level. As the article points out, you don't actually have to do this: the jextract tool will generate the bindings for you from header files.

    I'm sure someone will come along and write annotations to do exactly as you describe there. The Java language folks tend to be very conservative about putting stuff in the official API, cuz they know it'll have to stay there for 30+ years. They prefer to let the community write something like annotations over low-level APIs.

    Anyway, the GraalVM folks don't have quite the same limitations as Java, so they have annotations already (https://yyhh.org/blog/2021/02/writing-c-code-in-javaclojure-...):

        @CStruct("MDB_val")
        public interface MDB_val extends PointerBase {
    
            @CField("mv_size")
            long get_mv_size();
    
            @CField("mv_size")
            void set_mv_size(long value);
    
            @CField("mv_data")
            VoidPointer get_mv_data();
    
            @CField("mv_data")
            void set_mv_data(VoidPointer value);
        }
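
    For comparison, the same struct can be modelled with the plain FFM API, without annotations. This is a sketch under the assumption of Java 22 or later and a 64-bit platform (so `size_t` is 8 bytes); the class `MdbVal` and its accessors are made up for illustration and are not what jextract would emit.

    ```java
    import java.lang.foreign.Arena;
    import java.lang.foreign.MemoryLayout;
    import java.lang.foreign.MemoryLayout.PathElement;
    import java.lang.foreign.MemorySegment;
    import java.lang.foreign.StructLayout;
    import java.lang.foreign.ValueLayout;

    // Sketch: struct MDB_val { size_t mv_size; void *mv_data; } via a layout.
    final class MdbVal {
        static final StructLayout LAYOUT = MemoryLayout.structLayout(
                ValueLayout.JAVA_LONG.withName("mv_size"),   // assumes 8-byte size_t
                ValueLayout.ADDRESS.withName("mv_data"));

        // Field offsets are derived from the layout instead of being hard-coded.
        static final long SIZE_OFFSET = LAYOUT.byteOffset(PathElement.groupElement("mv_size"));
        static final long DATA_OFFSET = LAYOUT.byteOffset(PathElement.groupElement("mv_data"));

        static long getSize(MemorySegment struct) {
            return struct.get(ValueLayout.JAVA_LONG, SIZE_OFFSET);
        }

        static void setSize(MemorySegment struct, long value) {
            struct.set(ValueLayout.JAVA_LONG, SIZE_OFFSET, value);
        }

        static MemorySegment getData(MemorySegment struct) {
            return struct.get(ValueLayout.ADDRESS, DATA_OFFSET);
        }

        static void setData(MemorySegment struct, MemorySegment pointer) {
            struct.set(ValueLayout.ADDRESS, DATA_OFFSET, pointer);
        }

        public static void main(String[] args) {
            try (Arena arena = Arena.ofConfined()) {
                MemorySegment val = arena.allocate(LAYOUT);
                setSize(val, 128L);
                System.out.println("mv_size = " + getSize(val));
            }
        }
    }
    ```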

  • Can be even simpler now (you can declare it as a local function in a method, so this works when copied to Program.cs as is):

        var text = "Hello, World!"u8;
        write(1, text, text.Length);
    
        [DllImport("libc")]
        static extern nint write(nint fd, ReadOnlySpan<byte> buf, nint count);
    

    (note: it's recommended to use [LibraryImport] instead for p/invoke declarations that require marshalling as it does not require JIT/runtime marshalling but just generates (better) p/invoke stub at build time)

  • Yep, that is my main complaint, and why I would rather reach for JNI instead.

Does this mean that one can use SDL2 together with Java without bending over backwards?

  • It seems it will make it somewhat easier.

    But if you want to use SDL2 from something higher-level, you will be much better served by C# which will give you minimal FFI cost and most data structures you want to express in C as-is.

    • I don't know much about C#. It certainly looks more popular in gamedev circles.

      When I played with this new java api, I wasn't worried about the FFI cost. It seemed fast enough to me. My toy application was performing about 0.77x of pure C equivalent. I think Java's memory model and heavy heap use might hurt more. Hopefully Java will catch up when it gets value objects with Project Valhalla. Next decade or so :)

    • You are (or were) right. Java has (had) awful performance for foreign API calls, and I wonder whether this was fixed in this release because, as I heard, fixing it was the main reason for the upcoming functionality.

Calling C is easy. But how do you call C++? Shiboken has a language that lets you express ownership properties on C++ data structures/methods/functions. It's tailored to generating Python FFI bindings though. It would be so nice if there were a cross-platform language to do this.

  • The answer is basically you don't. It's impossible to make a sane, stable FFI for a language unless you put it behind a C ABI, which is relatively basic, but this is exactly why it's most suitable for FFI: implementing support for calling C functions is way more trivial than figuring out how to call the latest C++/Rust/etc monstrosity.

    • > The answer is basically you don't. It's impossible to make a sane, stable FFI for a language unless you put it behind a C ABI

      The Swift folks have put a lot of effort into attaining a stable ABI that's native to their language. They can achieve that because Swift is the officially endorsed language for development on Mac OS and iOS, so it (together with the platform itself) can set a standard that other languages will have to live with.

      In a way, software VM's like the JVM and CLR can also be said to define 'ABIs' of sorts within their runtime, that every language implementation on these runtimes will have to deal with.

    • There do exist ABIs that aren't the C ABI. But saying "use the C ABI" is far more portable than anything else.

      I can also point to the GCC Inline Assembler as an excellent way to call arbitrary functions whether they implement the standard C procedure call standard or not. By providing the list of arguments and what register they correspond to, along with the clobber list, you know everything you need to know to call the function. So it's more suitable for "fastcall" type functions where you need the arguments to correspond to particular registers.

      But of course, ASM isn't portable.

    • Swift ABI proves this to be wrong, but also showcases the complexity that goes with ABI of such kind.

  • This is something new. Before it you had to create a native-compatible shared library that returns jString/jObject instead or use a proxy which did this for you (JNA). Let's see what happens next, maybe even shiboken

Not directly related to the article, but is there any article that explains how memory management (stack/heap) works when using FFI in java? Also, when a call is made through FFI to a C library, is there a separate java and C call stack? I haven't found a good article yet on what happens under the hood.

  • For the heap, JEP 454 is reasonably detailed: https://openjdk.org/jeps/454

    It describes how to adopt memory from C and have C adopt memory you allocate, and gives control over how memory is allocated in an arena.

    The arena has lifecycle boundaries, and allocations determine the memory space available. Java guarantees (only) that you can't use unallocated memory or memory outside the arena, and if you access via a (correct) value layout, you should be able to navigate structure correctly.
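
    Both guarantees are easy to observe. A minimal sketch, assuming Java 22 or later; the class and method names are made up.

    ```java
    import java.lang.foreign.Arena;
    import java.lang.foreign.MemorySegment;
    import java.lang.foreign.ValueLayout;

    // Sketch: the two checks Java performs on arena-allocated memory.
    final class ArenaSafety {
        // Returns true if reading past the end of the segment is rejected.
        static boolean outOfBoundsRejected() {
            try (Arena arena = Arena.ofConfined()) {
                MemorySegment ints = arena.allocate(16, 4); // room for four ints
                ints.setAtIndex(ValueLayout.JAVA_INT, 3, 7); // last valid index
                try {
                    ints.getAtIndex(ValueLayout.JAVA_INT, 4); // one past the end
                    return false;
                } catch (IndexOutOfBoundsException expected) {
                    return true;
                }
            }
        }

        // Returns true if using a segment after its arena closed is rejected.
        static boolean useAfterCloseRejected() {
            MemorySegment leaked;
            try (Arena arena = Arena.ofConfined()) {
                leaked = arena.allocate(16, 4);
                leaked.set(ValueLayout.JAVA_INT, 0, 1);
            } // the arena closes here and its memory is freed
            try {
                leaked.get(ValueLayout.JAVA_INT, 0);
                return false;
            } catch (IllegalStateException expected) {
                return true;
            }
        }

        public static void main(String[] args) {
            System.out.println("out-of-bounds rejected: " + outOfBoundsRejected());
            System.out.println("use-after-close rejected: " + useAfterCloseRejected());
        }
    }
    ```

    These checks only cover accesses made from Java; native code handed a pointer into the arena is not constrained by them.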

    The interesting stuff is passing function pointers back and forth - look for `downcall method handles` and `upcall stubs`.
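
    A sketch of the function-pointer case, modelled on the `qsort` example in the JEP: Java calls C's `qsort`, and C calls back into a Java comparator. It assumes Java 22 or later and a 64-bit platform where `size_t` maps to `JAVA_LONG`; the class name is made up.

    ```java
    import java.lang.foreign.Arena;
    import java.lang.foreign.FunctionDescriptor;
    import java.lang.foreign.Linker;
    import java.lang.foreign.MemorySegment;
    import java.lang.foreign.ValueLayout;
    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;

    // Sketch: pass a Java method to C's qsort as a function pointer.
    final class QsortDemo {
        // Comparator that C invokes through the upcall stub.
        static int compare(MemorySegment a, MemorySegment b) {
            return Integer.compare(a.get(ValueLayout.JAVA_INT, 0),
                                   b.get(ValueLayout.JAVA_INT, 0));
        }

        static int[] sort(int[] values) {
            Linker linker = Linker.nativeLinker();
            try (Arena arena = Arena.ofConfined()) {
                // void qsort(void *base, size_t n, size_t size, int (*cmp)(const void*, const void*))
                MethodHandle qsort = linker.downcallHandle(
                        linker.defaultLookup().find("qsort").orElseThrow(),
                        FunctionDescriptor.ofVoid(ValueLayout.ADDRESS, ValueLayout.JAVA_LONG,
                                ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
                MethodHandle compareHandle = MethodHandles.lookup().findStatic(
                        QsortDemo.class, "compare",
                        MethodType.methodType(int.class, MemorySegment.class, MemorySegment.class));
                // The target layout lets the comparator read an int through each pointer.
                FunctionDescriptor compareDesc = FunctionDescriptor.of(ValueLayout.JAVA_INT,
                        ValueLayout.ADDRESS.withTargetLayout(ValueLayout.JAVA_INT),
                        ValueLayout.ADDRESS.withTargetLayout(ValueLayout.JAVA_INT));
                // The stub stays valid until the arena is closed.
                MemorySegment compareFn = linker.upcallStub(compareHandle, compareDesc, arena);
                MemorySegment array = arena.allocateFrom(ValueLayout.JAVA_INT, values);
                qsort.invokeExact(array, (long) values.length,
                        ValueLayout.JAVA_INT.byteSize(), compareFn);
                return array.toArray(ValueLayout.JAVA_INT);
            } catch (Throwable t) {
                throw new IllegalStateException("qsort call failed", t);
            }
        }

        public static void main(String[] args) {
            System.out.println(java.util.Arrays.toString(sort(new int[] {3, 1, 2})));
        }
    }
    ```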

  • Don't have an article, but the gist on stacks is that Java still uses the regular architecture stack (rsp on x86, etc) that the FFI'd code will, and on exit to/entry from FFI it'd have to store its stack end/start pointer (or otherwise be able to figure the range out) such that GC knows what to scan.

    • I wonder how it works when you use virtual threads. In Go, goroutines have resizable stacks which notoriously complicates FFI because C has no idea about resizable stacks (IIRC they have to temporarily switch to a separate, special C stack).

I had a C library I needed to ideally use from Java directly. The new FFI API looks great, but unfortunately the C API relied heavily on macros and void* arguments, making it incredibly difficult to model from Java.

  • I would give the jextract tool a try. I believe it uses LLVM to parse the header files, so the generated bindings might actually be pretty good.

    • I did use jextract for java 19+20, but it looked very messy.

      Tried it again yesterday on java 22, and the helpers from jextract are waaaay better. I actually completed an MVP implementation this time in a couple of hours. This could perhaps be released as a library if I find the effort to wrap it in a meaningful way!

      We currently wrap this in java by calling the binary with subprocesses, which has been working great at some latency overhead. The big bonus of this though, is that we can kill the process from java when it misbehaves. Putting this C code inside Java again, means we likely lose that control.
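
      For anyone wondering what that control looks like, here is a minimal sketch of the subprocess approach with a kill switch. The class name is made up, and the `sleep` command in `main` assumes a POSIX system.

      ```java
      import java.io.IOException;
      import java.io.UncheckedIOException;
      import java.time.Duration;
      import java.util.List;
      import java.util.OptionalInt;
      import java.util.concurrent.TimeUnit;

      // Sketch: run a native binary as a child process and kill it if it hangs.
      final class SupervisedProcess {
          // Returns the exit code, or empty if the process was killed after the timeout.
          static OptionalInt run(List<String> command, Duration timeout) {
              try {
                  Process process = new ProcessBuilder(command)
                          .redirectErrorStream(true)
                          .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                          .start();
                  if (process.waitFor(timeout.toMillis(), TimeUnit.MILLISECONDS)) {
                      return OptionalInt.of(process.exitValue());
                  }
                  // The child misbehaved: kill it without taking the JVM down with it.
                  process.destroyForcibly().waitFor();
                  return OptionalInt.empty();
              } catch (IOException e) {
                  throw new UncheckedIOException(e);
              } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
                  throw new IllegalStateException(e);
              }
          }

          public static void main(String[] args) {
              // "sleep" is assumed to exist on the PATH (POSIX systems).
              System.out.println(run(List.of("sleep", "30"), Duration.ofMillis(200)));
          }
      }
      ```

      With an in-process FFI call there is no equivalent of `destroyForcibly()`: a hang or crash in the C code affects the whole JVM.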

I can’t see why I’d ever reach for it, but I do like knowing that Java is actively being improved over time

What’s the use case here? Developing drivers with Java?

  • Invoking native code has always been necessary in Java. In the past it was done via JNI which has many issues. These new APIs solve the issues and simplify the API. The use case is interacting with anything that isn't written in Java.

    • Blast from the past! I remember doing JNI integration in Java around 2003! It's been so long I don't remember details but you had to declare some interfaces in java, then some middleware .h or .c and then call the native library iirc.

      Glad to see things are progressing!!

  • Same use case as to why .NET has low/zero-cost FFI.

    This is similar, except more boilerplate and much, much slower.

    • The FFM downcalls in OpenJDK compile down to argument shuffling + a CALL instruction (in "critical" linker mode), i.e. the same machine code gcc/clang would generate for a call from a C program.
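
      Requesting that mode looks roughly like this. A sketch only, assuming Java 22 or later; `abs` is used because it is short-running and never calls back into Java, which is what the option is intended for, and the class name is made up.

      ```java
      import java.lang.foreign.FunctionDescriptor;
      import java.lang.foreign.Linker;
      import java.lang.foreign.ValueLayout;
      import java.lang.invoke.MethodHandle;

      // Sketch: a downcall handle linked with the "critical" option.
      final class CriticalDowncall {
          private static final Linker LINKER = Linker.nativeLinker();

          // int abs(int) from the C standard library. Keeping the handle in a
          // static final field lets the JIT treat it as a constant.
          private static final MethodHandle ABS = LINKER.downcallHandle(
                  LINKER.defaultLookup().find("abs").orElseThrow(),
                  FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.JAVA_INT),
                  Linker.Option.critical(false)); // false: no heap segments are passed

          static int abs(int x) {
              try {
                  return (int) ABS.invokeExact(x);
              } catch (Throwable t) {
                  throw new IllegalStateException("abs call failed", t);
              }
          }

          public static void main(String[] args) {
              System.out.println(abs(-42));
          }
      }
      ```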

    • I still have some hopes that it will evolve towards a P/Invoke like experience.

      While a step closer to Valhalla, the whole dev experience is still quite lacking versus what .NET offers.

      Currently it is too much like making direct use of InteropServices.

    • > This is similar, except more boilerplate and much, much slower.

      That's JNI, which really was truly terrible. Java 22 introducing FFM is finally an admission that JNI was crap and a dead end.

C# does a much better job of calling into C code. All the programmer has to do is either write an extern function with the "DllImport" attribute, or they can turn a raw function pointer into a delegate. (Or even directly use a function pointer in newer versions of C#)

Last time I checked (ca. 2017-9) every call to a foreign API in Java had to create a memory barrier, causing a flush of all CPU caches. This was different from using normal JVM interfaces, and when I asked someone at a Java conference, he told me they cheated when writing calls to the JVM API, but other people need to adhere to the rules. I wonder what happened with this in Java 22, as this change was highly anticipated.

  • Memory barriers don't force a flush of all CPU cache. They will enforce the ordering of memory operations issued before and after the barrier instruction, preserving the contents of the CPU's various caches.