How to Use the Foreign Function API in Java 22 to Call C Libraries

14 days ago (ifesunmola.com)

145 comments

pjmlp

I am sort of surprised that there isn't a widely used tool that uses codegen to generate JNI bindings, sort of like what JNA does, but at build time. You could go meta and bundle a builder in a jar that looks for the shared library in a particular place and shells out to build and install the native library if it is missing on the host computer. This would run once, pretty similar I think to bundling native code in npm.

I have bundled shared libraries for five or six platforms in a java library that needs to make syscalls. It works but it is a pain if anything ever changes or a new platform needs to be brought up. Checking in binaries always feels icky but is necessary if not all targets can be built on a single machine.

The problem with the new API is that people upgrade Java very slowly in most contexts. For an OSS library developer, I see very little added value in this feature because I'm still stuck with all of my users who are on an older version of Java. If I integrate the new FFI API, I now have to support both it and JNI.

MaxBarraclough 11 days ago
> I am sort of surprised that there isn't a widely used tool that uses codegen to generate jni bindings sort of like what the jna does but at build time
There are several, including SWIG.
- andoando 11 days ago
  
  Which is still a pita to use unless maybe you really know what you're doing.
  
lelanthran 11 days ago

There is SWIG, which does bindings to and from C for almost every language that exists.
PaulHoule 11 days ago

Back in 1998 I wrote a code generator to make JNI stubs for LAPACK. It’s the kind of programming that goes that way.
gudzpoz 11 days ago

There is a library called jnigen [1], mainly used by the libGDX framework [2], but I don't see it used in many other projects. Personally I use it to maintain a set of Lua C API bindings for some platforms [3] and it works sort of OK once you manage to somehow set up a workflow for building and testing the binaries.
> It works but it is a pain if anything ever changes or a new platform needs to be brought up. Checking in binaries always feels icky but is necessary if not all targets can be built on a single machine.
It is definitely a pain when you cannot test all changes on a single local machine. But I would argue that it is true whenever multiple platforms (or maybe even multiple glibc versions) are involved, regardless of what languages/libraries/tools you use.
[1] https://libgdx.com/wiki/utils/jnigen [2] https://github.com/libgdx/libgdx [3] https://github.com/gudzpoz/luajava

marginalia_nu 12 days ago

What I'm missing is a model for building/distributing those C libraries with a java application.

Every FFI example I've found seems to operate on the assumption that you want to invoke syscalls or libc, which (with the possible exception of the likes of madvise and aioring) Java already mostly has decent facilities to interact with even without native calls.

sedro 12 days ago
Native libraries are typically packaged inside a jar so that everything works over the existing build and dependency management systems.
For example, each of these jars named "native-$os-$arch.jar" contains a .dll/.so/.dylib: https://repo1.maven.org/maven2/com/aayushatharva/brotli4j/
JNA will extract the appropriate native library (using os.name and os.arch system properties), save the library to a temp file, then load it.
- throwaway2037 11 days ago
  
  > JNA will extract the appropriate native library ..., save the library to a temp file, then load it.
  JNA does this?
  FYI: JNA = Java Native Access project: https://github.com/java-native-access/jna
  
- okr 10 days ago
  
  Examples of JARs that transport such libraries: snappy, sqlite...
gwbas1c 11 days ago

> Every ffi example I've found seems to operate on the assumption that you want to invoke syscalls or libc ... Java already mostly has decent facilities to interact with even without native calls.
Because you would use FFI to interact with libraries that don't have Java wrappers yet, i.e., you're the one writing the wrapper.
Using syscalls or libc is a way to write an example against a known library that you're probably familiar with.
pron 12 days ago
The recommended distribution model for Java applications is a jlinked runtime image [1], which supports including native libraries in the image.
[1]: Technically, this is the only distribution model because all Java runtimes as of JDK 9 are created with jlink, including the runtime included in the JDK (which many people use as-is), but I mean a custom runtime packaged with the application.
- maksut 12 days ago
  
  Is that still true when distributing libraries?
  
aardvark179 11 days ago
So, other people have already answered this, but this does seem to be a gap where many developers lack some piece of knowledge to chain the whole solution together. You normally package this sort of thing by putting the native library in a jar, extracting it to a tmp file that will be deleted on exit, and opening that dynamic library.
I’ve met many perfectly reasonable developers who do know all those steps can be done but can’t put them all together, maybe because it just hasn’t clicked that you can store a library in a jar. It feels like something tutorials should cover, but I think it falls into the “surely everyone can work it out?” category.
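The steps described above (library stored in the jar, copied to a temp file, then loaded) can be sketched roughly as follows. This is a minimal sketch and not how any particular library does it: the class name `NativeLoader` and the `/native/<platform>-<arch>/` resource layout are made up for illustration, and only standard JDK calls are used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class NativeLoader {
    // Hypothetical layout inside the jar: /native/<platform>-<arch>/<mapped library name>
    static String resourcePath(String libName) {
        String os = System.getProperty("os.name").toLowerCase();
        String arch = System.getProperty("os.arch").toLowerCase();
        String platform = os.contains("mac") ? "macos"
                : os.contains("win") ? "windows"
                : "linux";
        // mapLibraryName turns "foo" into libfoo.so, foo.dll or libfoo.dylib
        return "/native/" + platform + "-" + arch + "/" + System.mapLibraryName(libName);
    }

    // Copies the library bytes to a fresh temp directory and returns the file path.
    static Path extract(InputStream in, String fileName) throws IOException {
        Path dir = Files.createTempDirectory("native");
        // deleteOnExit removes files in reverse registration order,
        // so the directory is registered first and deleted last.
        dir.toFile().deleteOnExit();
        Path target = dir.resolve(fileName);
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        target.toFile().deleteOnExit();
        return target;
    }

    // Finds the bundled library on the classpath, extracts it and loads it.
    static void load(String libName) throws IOException {
        String resource = resourcePath(libName);
        try (InputStream in = NativeLoader.class.getResourceAsStream(resource)) {
            if (in == null) {
                throw new UnsatisfiedLinkError("No bundled library at " + resource);
            }
            Path file = extract(in, System.mapLibraryName(libName));
            // System.load needs an absolute path to a real file on disk
            System.load(file.toAbsolutePath().toString());
        }
    }
}
```

A caller would run something like `NativeLoader.load("mylib")` once, before the first native call, where `mylib` is a placeholder name. Using `deleteOnExit` is one choice; deleting the file right after `System.load` is another on platforms that allow it.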
- xxs 11 days ago
  
  > extracting it to a tmp file that will be deleted on exit,
  Actually, you delete it immediately (after load) on anything that's not Windows. You can try there too, but delete is likely to return false.
  deleteOnExit just stores the path to delete and uses a shutdownHook to actually call delete. Nothing really special about it.
- sedro 11 days ago
  
  You would also need to learn about Maven profiles and activation. And for other build tools, you'll be delighted to know they have partial support.
- chii 11 days ago
  
  > extracting it to a tmp file
  I wonder if there's a way to do this entirely in memory? Because some deployment scenarios might not have disk space at all.
  
mike_hearn 11 days ago

If your app is open source, or you're willing to buy a commercial tool, then you could try Conveyor from my company [1]. It will:
- Find all the shared libraries in your JARs or configured app inputs (files in your build/source tree)
- Sniff them to figure out what OS and CPU arch they are for
- Bundle them into the right package for each platform that it makes, in the right place to be found by System.loadLibrary()
- Sign them if necessary
- Delete them from the JARs now they are extracted. Optionally extract them from library JARs, sign them and then put them back if your library refuses to load the shared library from disk instead of unpacking it (most libs don't need this)
- JLink a bundled JVM for your app for each platform you target, using jdeps to figure out the right set of modules, and combine that with your shared libs.
When building Debian/Ubuntu packages it will also:
- Read the .so library dependencies, look up the packages that contain those other shared libraries and add package dependencies on those packages, so "apt install" will do the right thing.
So that makes it a lot easier to distribute Java apps that use native code.
[1] https://www.hydraulic.dev/
pjmlp 12 days ago
You do it the standard way: package them inside the jar file.
- marginalia_nu 12 days ago
  
  Oh, does this actually work?
  I was under the assumption that it was dynamically linking the library with the OS dynamic linker, which in no OS I'm aware of is capable of loading libraries inside of zip files.
  Not sure where I got that notion. Maybe I was overthinking this.
  
- fire_lake 12 days ago
  
  Is there a solution when the binaries are 500mb+ per platform?
  
ruslan_talpa 12 days ago

Put them in a jar?

stevefan1999 11 days ago

Compared to .NET's P/Invoke this is still way too convoluted. Of course Java has its own domain problems: treating everything as a reference (and thus a pointer; there is a reason Java has NullPointerException rather than NullReferenceException) and the lack of stack value types (everything lives on the heap unless escape analysis allows some data to stay on the stack, but that is uncontrollable anyway) make translating Plain-Old-Data (POD) types in Java very difficult, whereas it is mostly a no-op in C#. That's why JNI exists as a mediator between native code and the Java VM.

In C# I can just do something like this conceptual code:

```

// FILE *fopen(const char *filename, const char *mode)
[DllImport("libc")] public static extern nint fopen([MarshalAs(UnmanagedType.LPStr)] string filename, [MarshalAs(UnmanagedType.LPStr)] string mode);

// char *fgets(char *str, int n, FILE *stream)
// str is an output buffer, so it is passed as a byte array rather than a string
[DllImport("libc")] public static extern nint fgets(byte[] str, int n, nint stream);

// int fclose(FILE *stream)
[DllImport("libc")] public static extern int fclose(nint stream);

```

So much less code, and so much more precise than any of the Java JNI and FFI stuff.
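For a concrete comparison, the same three functions bound by hand with the Java 22 FFM API might look like the sketch below. It assumes a Unix-like system where `fopen`, `fgets` and `fclose` are visible through the linker's default lookup; the class name `CStdio`, the helper `bind` and the 256-byte buffer are illustrative choices, not anything from the article.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SymbolLookup;
import java.lang.invoke.MethodHandle;

import static java.lang.foreign.ValueLayout.ADDRESS;
import static java.lang.foreign.ValueLayout.JAVA_INT;

class CStdio {
    private static final Linker LINKER = Linker.nativeLinker();
    private static final SymbolLookup LIBC = LINKER.defaultLookup();

    // Hypothetical helper: looks up a C symbol and builds a downcall handle for it.
    private static MethodHandle bind(String name, FunctionDescriptor descriptor) {
        return LINKER.downcallHandle(LIBC.find(name).orElseThrow(), descriptor);
    }

    // FILE *fopen(const char *filename, const char *mode)
    private static final MethodHandle FOPEN =
            bind("fopen", FunctionDescriptor.of(ADDRESS, ADDRESS, ADDRESS));
    // char *fgets(char *str, int n, FILE *stream)
    private static final MethodHandle FGETS =
            bind("fgets", FunctionDescriptor.of(ADDRESS, ADDRESS, JAVA_INT, ADDRESS));
    // int fclose(FILE *stream)
    private static final MethodHandle FCLOSE =
            bind("fclose", FunctionDescriptor.of(JAVA_INT, ADDRESS));

    // Reads the first line of a file (newline included), or null if the file is empty.
    static String firstLine(String path) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cPath = arena.allocateFrom(path); // NUL-terminated C string
            MemorySegment cMode = arena.allocateFrom("r");
            MemorySegment file = (MemorySegment) FOPEN.invokeExact(cPath, cMode);
            if (file.address() == 0) {
                throw new IllegalStateException("fopen failed for " + path);
            }
            try {
                MemorySegment buf = arena.allocate(256); // zero-initialized buffer
                MemorySegment res = (MemorySegment) FGETS.invokeExact(buf, 256, file);
                return res.address() == 0 ? null : buf.getString(0);
            } finally {
                int ignored = (int) FCLOSE.invokeExact(file);
            }
        } catch (Throwable t) {
            // invokeExact is declared to throw Throwable; wrapped to keep callers simple
            throw new IllegalStateException("native call failed", t);
        }
    }
}
```

The descriptors and the casts around `invokeExact` have to match the C signatures exactly, which is where most of the extra ceremony compared to the attribute-based declarations comes from.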

the-alchemist 11 days ago
Java's FFI is currently very low-level. As the article points out, you don't actually have to do this: the jextract tool will generate the bindings for you from header files.
I'm sure someone will come along and write annotations to do exactly as you describe there. The Java language folks tend to be very conservative about putting stuff in the official API, cuz they know it'll have to stay there for 30+ years. They prefer to let the community write something like annotations over low-level APIs.
Anyway, the GraalVM folks don't have quite the same limitations as Java, so they have annotations already (https://yyhh.org/blog/2021/02/writing-c-code-in-javaclojure-...):
@CStruct("MDB_val")
public interface MDB_val extends PointerBase {
    @CField("mv_size") long get_mv_size();
    @CField("mv_size") void set_mv_size(long value);
    @CField("mv_data") VoidPointer get_mv_data();
    @CField("mv_data") void set_mv_data(VoidPointer value);
}
- pjmlp 10 days ago
  
  I wasn't aware of it, great! One more point to the GraalVM folks.
neonsunset 11 days ago
Can be even simpler now (you can declare it as a local function in a method, so this works when copied to Program.cs as is):
var text = "Hello, World!"u8;
write(1, text, text.Length);

[DllImport("libc")]
static extern nint write(nint fd, ReadOnlySpan<byte> buf, nint count);
(note: it's recommended to use [LibraryImport] instead for p/invoke declarations that require marshalling as it does not require JIT/runtime marshalling but just generates (better) p/invoke stub at build time)
pjmlp 11 days ago

Yep, that is my main complaint, and why I would rather reach for JNI instead.

xyproto 12 days ago

Does this mean that one can use SDL2 together with Java without bending over backwards?

maksut 11 days ago

I have played with raylib bindings for clojure by using the new foreign function api. It was a lot of fun. SDL might be a better fit because it prefers pass by reference arguments [0].
[0] https://gist.github.com/raysan5/17392498d40e2cb281f5d09c0a4b...
neonsunset 12 days ago
It seems it will make it somewhat easier.
But if you want to use SDL2 from something higher-level, you will be much better served by C# which will give you minimal FFI cost and most data structures you want to express in C as-is.
- maksut 11 days ago
  
  I don't know much about C#. It certainly looks more popular in gamedev circles.
  When I played with this new Java API, I wasn't worried about the FFI cost. It seemed fast enough to me. My toy application was performing at about 0.77x of the pure C equivalent. I think Java's memory model and heavy heap use might hurt more. Hopefully Java will catch up when it gets value objects with Project Valhalla. Next decade or so :)
  
- p0w3n3d 11 days ago
  
  You are (or were) right. Java has (had) awful performance for foreign API calls, and I wonder whether this was fixed in this release because, as I heard, fixing it was the main reason for the upcoming functionality.
  
marginalia_nu 11 days ago

Oh man that's a cool idea.
Might just build an SDL2 wrapper just as an FFI and FFM exercise.

alex_suzuki 12 days ago

Wonder if this will make JNA (Java Native Access) redundant at some point: https://github.com/java-native-access/jna

Very useful, especially the prebundled platform bindings.

iso8859-1 11 days ago

Calling C is easy. But how do you call C++? Shiboken has a language that lets you express ownership properties on C++ data structures/methods/functions. It's tailored to generating Python FFI bindings though. It would be so nice if there were a cross-platform language to do this.

qweqwe14 11 days ago
The answer is basically you don't. It's impossible to make a sane, stable FFI for a language unless you put it behind a C ABI, which is relatively basic, but this is exactly why it's most suitable for FFI: implementing support for calling C functions is way more trivial than figuring out how to call the latest C++/Rust/etc monstrosity.
- zozbot234 11 days ago
  
  > The answer is basically you don't. It's impossible to make a sane, stable FFI for a language unless you put it behind a C ABI
  The Swift folks have put a lot of effort into attaining a stable ABI that's native to their language. They can achieve that because Swift is the officially endorsed language for development on Mac OS and iOS, so it (together with the platform itself) can set a standard that other languages will have to live with.
  In a way, software VM's like the JVM and CLR can also be said to define 'ABIs' of sorts within their runtime, that every language implementation on these runtimes will have to deal with.
- Dwedit 11 days ago
  
  There do exist ABIs that aren't the C ABI. But saying "use the C ABI" is far more portable than anything else.
  I can also point to the GCC Inline Assembler as an excellent way to call arbitrary functions whether they implement the standard C procedure call standard or not. By providing the list of arguments and what register they correspond to, along with the clobber list, you know everything you need to know to call the function. So it's more suitable for "fastcall" type functions where you need the arguments to correspond to particular registers.
  But of course, ASM isn't portable.
- neonsunset 11 days ago
  
  Swift ABI proves this to be wrong, but also showcases the complexity that goes with ABI of such kind.
secondcoming 11 days ago

You put your C++ behind a C API.
p0w3n3d 11 days ago

This is something new. Previously you had to create a native-compatible shared library that returns jstring/jobject, or use a proxy that did this for you (JNA). Let's see what happens next, maybe even Shiboken.
imtringued 11 days ago

I don't know why people don't know this, but you can just use GObject.
mike_hearn 11 days ago

There's javacpp which can do that.

creativeSlumber 12 days ago

Not directly related to the article, but is there any article that explains how memory management (stack/heap) works when using FFI in Java? Also, when a call is made through FFI to a C library, is there a separate Java and C call stack? I haven't found a good article yet on what happens under the hood.

w10-1 11 days ago

For the heap, JEP 454 is reasonably detailed: https://openjdk.org/jeps/454
It describes how to adopt memory from C and have C adopt memory you allocate, and gives control over how memory is allocated in an arena.
The arena has lifecycle boundaries, and allocations determine the memory space available. Java guarantees (only) that you can't use unallocated memory or memory outside the arena, and if you access via a (correct) value layout, you should be able to navigate structure correctly.
The interesting stuff is passing function pointers back and forth - look for `downcall method handles`.
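As a small illustration of the arena and function-pointer points above, here is a sketch along the lines of the `qsort` example in JEP 454. In the JEP's terms, calling into C uses a downcall method handle, and a function pointer that C can call back into Java is an upcall stub. It assumes Java 22 and a platform whose default lookup exposes `qsort`; the class name `QsortDemo` and its methods are illustrative.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

import static java.lang.foreign.ValueLayout.ADDRESS;
import static java.lang.foreign.ValueLayout.JAVA_INT;
import static java.lang.foreign.ValueLayout.JAVA_LONG;

class QsortDemo {
    // Java comparator that C will call through a function pointer.
    static int compare(MemorySegment a, MemorySegment b) {
        return Integer.compare(a.get(JAVA_INT, 0), b.get(JAVA_INT, 0));
    }

    static int[] sort(int[] values) {
        Linker linker = Linker.nativeLinker();
        try (Arena arena = Arena.ofConfined()) {
            // Downcall handle for:
            // void qsort(void *base, size_t n, size_t size, int (*cmp)(const void*, const void*))
            MethodHandle qsort = linker.downcallHandle(
                    linker.defaultLookup().find("qsort").orElseThrow(),
                    FunctionDescriptor.ofVoid(ADDRESS, JAVA_LONG, JAVA_LONG, ADDRESS));

            // Upcall stub: a native function pointer that dispatches to compare().
            // withTargetLayout gives the incoming pointers a 4-byte size so they can be read.
            MethodHandle cmp = MethodHandles.lookup().findStatic(QsortDemo.class, "compare",
                    MethodType.methodType(int.class, MemorySegment.class, MemorySegment.class));
            MemorySegment cmpPtr = linker.upcallStub(cmp,
                    FunctionDescriptor.of(JAVA_INT,
                            ADDRESS.withTargetLayout(JAVA_INT),
                            ADDRESS.withTargetLayout(JAVA_INT)),
                    arena);

            // Off-heap copy of the array, owned by the arena and freed when it closes.
            MemorySegment array = arena.allocateFrom(JAVA_INT, values);
            qsort.invokeExact(array, (long) values.length, JAVA_INT.byteSize(), cmpPtr);
            return array.toArray(JAVA_INT); // copy back to the Java heap before the arena closes
        } catch (Throwable t) {
            // invokeExact and findStatic throw checked exceptions; wrapped for brevity
            throw new IllegalStateException("native call failed", t);
        }
    }
}
```

Both the off-heap array and the upcall stub are tied to the confined arena, so neither can be used once the try block exits, which is the lifecycle boundary mentioned above.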
dzaima 11 days ago
Don't have an article, but the gist on stacks is that Java still uses the regular architecture stack (rsp on x86, etc) that the FFI'd code will, and on exit to/entry from FFI it'd have to store its stack end/start pointer (or otherwise be able to figure the range out) such that GC knows what to scan.
- kgeist 11 days ago
  
  I wonder how it works when you use virtual threads. In Go, goroutines have resizable stacks which notoriously complicates FFI because C has no idea about resizable stacks (IIRC they have to temporarily switch to a separate, special C stack).
  

jakjak123 11 days ago

I had a C library I needed to ideally use from Java directly. The new FFI API looks great, but unfortunately the C API relied heavily on macros and void* arguments, making it incredibly difficult to model from Java.

the-alchemist 11 days ago
I would give the jextract tool a try. I believe it uses LLVM to parse the header files, so the generated bindings might actually be pretty good.
- jakjak123 8 days ago
  
  I did use jextract for Java 19 and 20, but the result looked very messy.
  Tried it again yesterday on Java 22, and the helpers from jextract are waaaay better. I actually completed an MVP implementation this time in a couple of hours. This could perhaps be released as a library if I find the effort to wrap it in a meaningful way!
  We currently wrap this in Java by calling the binary with subprocesses, which has been working great at the cost of some latency overhead. The big bonus of this, though, is that we can kill the process from Java when it misbehaves. Putting this C code inside Java again means we likely lose that control.

petesergeant 11 days ago

I can’t see why I’d ever reach for it, but I do like knowing that Java is actively being improved over time

xyst 12 days ago

What’s the use case here? Developing drivers with Java?

invalidname 12 days ago
Invoking native code has always been necessary in Java. In the past it was done via JNI which has many issues. These new APIs solve the issues and simplify the API. The use case is interacting with anything that isn't written in Java.
- xtracto 11 days ago
  
  Blast from the past! I remember doing JNI integration in Java around 2003! It's been so long I don't remember details but you had to declare some interfaces in java, then some middleware .h or .c and then call the native library iirc.
  Glad to see things are progressing!!
neonsunset 12 days ago
Same use case as to why .NET has low/zero-cost FFI.
This is similar, except more boilerplate and much, much slower.
- pron 12 days ago
  
  The FFM downcalls in OpenJDK compile down to argument shuffling + a CALL instruction (in "critical" linker mode), i.e. the same machine code gcc/clang would generate for a call from a C program.
  
- pjmlp 12 days ago
  
  I still have some hopes that it will evolve towards a P/Invoke like experience.
  While a step closer to Valhalla, the whole dev experience is still quite lacking versus what .NET offers.
  Currently it is too much like making direct use of InteropServices.
  
- miffy900 11 days ago
  
  > This is similar, except more boilerplate and much, much slower.
  That's JNI, which really was truly terrible. Java 22 introducing FFM is finally an admission that JNI was crap and a dead end.
  
- trelane 12 days ago
  
  [flagged]

Dwedit 11 days ago

C# does a much better job of calling into C code. All the programmer has to do is either write an extern function with the "DllImport" attribute, or they can turn a raw function pointer into a delegate. (Or even directly use a function pointer in newer versions of C#.)

p0w3n3d 11 days ago

Last time I checked (ca. 2017-9) every call to a foreign API in Java had to create a memory barrier, causing a flush of all CPU caches. This was different from using the normal JVM interfaces, and when I asked some guy at a Java conference, he told me they cheated when writing the calls to the JVM API, but other people need to adhere to the rules. I wonder what happened in this matter in Java 22, as this change was highly anticipated.

ryanpetrich 11 days ago

Memory barriers don't force a flush of all CPU cache. They will enforce the ordering of memory operations issued before and after the barrier instruction, preserving the contents of the CPU's various caches.