From 9cb29409674435034cfe7f2fba263604b1afbb97 Mon Sep 17 00:00:00 2001
From: Olof-Joachim Frahm
Date: Wed, 11 Nov 2015 17:14:37 +0100
Subject: [PATCH] Add post using LIRE with ABCL, FFI DSL discussion.

---
 abcl-and-lire.post | 230 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 230 insertions(+)
 create mode 100644 abcl-and-lire.post

diff --git a/abcl-and-lire.post b/abcl-and-lire.post
new file mode 100644
index 0000000..bb1c1ca
--- /dev/null
+++ b/abcl-and-lire.post
@@ -0,0 +1,230 @@
+;;;;;
+title: Setting up ABCL and LIRE
+date: 2015-11-11 20:34:29
+format: md
+tags: lisp
+bindings: 3bmd-code-blocks:*code-blocks-default-colorize* :common-lisp
+;;;;;
+
+## Intro
+
+The purpose of this article is to examine what using ABCL with existing
+libraries (arguably *the* main point of using ABCL at the moment)
+actually looks like in practice. Never mind integration with Spring or
+other more involved frameworks; this will only touch a single library
+and won't require us to write from-Java-callable classes.
+
+In the process of refining this I'm hoping to also get ideas about the
+requirements for building a better DSL for the Java FFI, based on the
+intended "look" of the code (that is, coding by wishful thinking).
+
+## Setup
+
+Ensuring the correct package is somewhat optional:
+
+```
+(in-package #:cl-user)
+```
+
+Generally, using JSS is a bit nicer than the plain Java FFI. After the
+contribs are loaded, JSS can be required and used:
+
+```
+(require '#:jss)
+
+(use-package '#:jss)
+```
+
+Next, we need access to the right libraries. After building LIRE from
+source and executing the `mvn dist` command we end up with a JAR file
+for LIRE and several dependencies in the `lib` folder.
+All of them need to be on the classpath:
+
+```
+(progn
+  (add-to-classpath "~/src/LIRE/dist/lire.jar")
+  (mapc #'add-to-classpath (directory "~/src/LIRE/dist/lib/*.jar")))
+```
+
+## Prelude
+
+Since we're going to read pictures in a couple of places, a helper to
+load one from a pathname is a good start:
+
+```
+(defun read-image (pathname)
+  (#"read" 'javax.imageio.ImageIO (new 'java.io.FileInputStream (namestring pathname))))
+```
+
+Of note here are the use of `NEW` from JSS with a symbol for the class
+name, the conversion of the pathname to a regular string (since the Java
+side doesn't expect a Lisp object), and the `#""` reader syntax from JSS
+to invoke the method `read` in a somewhat simpler way than using the FFI
+calls directly.
+
+Since we can't "import" Java names, we're stuck with either using
+symbols like this, or caching them in (shorter) variables.
+
+For comparison, the raw FFI would be a bit more verbose:
+
+```
+(defun read-image (pathname)
+  (jstatic "read" "javax.imageio.ImageIO" (jnew "java.io.FileInputStream" (namestring pathname))))
+```
+
+Though with a combination of JSS and cached lookup it could be nicer,
+even though the setup is more verbose:
+
+```
+(defvar +image-io+ (jclass "javax.imageio.ImageIO"))
+(defvar +file-input-stream+ (jclass "java.io.FileInputStream"))
+
+(defun read-image (pathname)
+  (#"read" +image-io+ (jnew +file-input-stream+ (namestring pathname))))
+```
+
+At this point, without other improvements (auto-coercion of pathnames,
+importing namespaces), it's about as factored as it will be (except
+moving every single call into its own Lisp wrapper function).
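+
+The cached-lookup setup above could itself be factored into a small
+macro. This is only a sketch: `DEFINE-JAVA-CLASSES` is a made-up name,
+not something that ABCL or JSS provides, and it assumes `JCLASS` is
+accessible in the current package (as it is in `CL-USER` on ABCL):
+
+```
+;; Hypothetical helper, not part of ABCL or JSS.
+(defmacro define-java-classes (&body specs)
+  "Define one variable per (VARIABLE CLASS-NAME) pair in SPECS, each
+holding the result of looking up CLASS-NAME with JCLASS once."
+  `(progn
+     ,@(loop for (variable class-name) in specs
+             collect `(defvar ,variable (jclass ,class-name)))))
+
+;; Expands into the same two DEFVAR forms as written out by hand above.
+(define-java-classes
+  (+image-io+ "javax.imageio.ImageIO")
+  (+file-input-stream+ "java.io.FileInputStream"))
+```
+
+That keeps the verbose class names in one place, although every class
+still has to be listed by hand.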
+
+## Building an index
+
+To keep it simple, building the index will be done from a list of
+pathnames in a single step while providing the path of the index as a
+separate parameter:
+
+```
+(defun build-index (index-name pathnames)
+  (let ((global-document-builder
+          (new 'net.semanticmetadata.lire.builders.GlobalDocumentBuilder
+               (jclass "net.semanticmetadata.lire.imageanalysis.features.global.CEDD")))
+        (index-writer (#"createIndexWriter"
+                       'net.semanticmetadata.lire.utils.LuceneUtils
+                       index-name
+                       +true+
+                       (jfield "net.semanticmetadata.lire.utils.LuceneUtils$AnalyzerType" "WhitespaceAnalyzer"))))
+    (unwind-protect
+        (dolist (pathname pathnames)
+          (let ((pathname (namestring pathname)))
+            (format T "Indexing ~A ..." pathname)
+            (let* ((image (read-image pathname))
+                   (document (#"createDocument" global-document-builder image pathname)))
+              (#"addDocument" index-writer document))
+            (format T " done.~%")))
+      (#"closeWriter" 'net.semanticmetadata.lire.utils.LuceneUtils index-writer))))
+```
+
+The process is simply creating the document builder and index writer,
+reading all the files one by one and adding them to the index. There's
+no error checking at the moment though.
+
+Of note here is that looking up the precise kind of a Java name is a bit
+of a hassle. Of course intuition goes a long way, but again, manually
+figuring out whether a name is a nested class or static/enum field is
+annoying enough since it involves either repeated calls to `JAPROPOS`,
+or reading more Java documentation.
+
+Apart from that, this is mostly a direct transcription. Unfortunately,
+written this way there's no point in creating a `WITH-OPEN-*` macro to
+automatically close the writer; however, looking at the `LuceneUtils`
+source, this could be accomplished by directly calling `close` on the
+writer object instead.
+
+It would also be nice to have auto conversion using keywords for enum
+values instead of needing to look up the value manually.
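+
+The keyword conversion wished for above can at least be approximated in
+plain Lisp. The following is a rough sketch with made-up names
+(`KEYWORD-TO-JAVA-NAME` and `ENUM-VALUE` are not part of ABCL or JSS),
+and it assumes CamelCase constant names like `WhitespaceAnalyzer`;
+conventional `UPPER_SNAKE_CASE` enum constants would need a different
+mapping:
+
+```
+;; Hypothetical helpers, not part of ABCL or JSS.
+(defun keyword-to-java-name (keyword)
+  "Map a keyword like :WHITESPACE-ANALYZER to \"WhitespaceAnalyzer\"."
+  ;; STRING-CAPITALIZE treats the hyphens as word separators.
+  (remove #\- (string-capitalize (symbol-name keyword))))
+
+(defun enum-value (class-name keyword)
+  "Look up the static field of CLASS-NAME that KEYWORD names."
+  (jfield class-name (keyword-to-java-name keyword)))
+
+;; Usage on ABCL, equivalent to the JFIELD call in BUILD-INDEX:
+;; (enum-value "net.semanticmetadata.lire.utils.LuceneUtils$AnalyzerType"
+;;             :whitespace-analyzer)
+```
+
+The class name still has to be spelled out in full, so this only
+removes the manual lookup of the field name, not the namespace noise.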
+
+## Querying an index
+
+The other way round, looking up related pictures by passing in an
+example, is done using an image searcher:
+
+```
+(defun query-index (index-name pathname)
+  (let* ((image (read-image pathname))
+         (index-reader (#"open" 'org.apache.lucene.index.DirectoryReader
+                                (#"open" 'org.apache.lucene.store.FSDirectory
+                                         (#"get" 'java.nio.file.Paths index-name (jnew-array "java.lang.String" 0))))))
+    (unwind-protect
+        (let* ((image-searcher (new 'GenericFastImageSearcher 30 (jclass "net.semanticmetadata.lire.imageanalysis.features.global.CEDD")))
+               (hits (#"search" image-searcher image index-reader)))
+          (dotimes (i (#"length" hits))
+            (let ((found-pathname (#"getValues" (#"document" index-reader (#"documentID" hits i))
+                                                (jfield "net.semanticmetadata.lire.builders.DocumentBuilder"
+                                                        "FIELD_NAME_IDENTIFIER"))))
+              (format T "~F: ~A~%" (#"score" hits i) found-pathname))))
+      (#"closeReader" 'net.semanticmetadata.lire.utils.LuceneUtils index-reader))))
+```
+
+Of note here is that the `get` call on `java.nio.file.Paths` took way
+more time to figure out than should've been necessary: Essentially the
+method is using a variable number of arguments, but the FFI doesn't help
+in any way, so the array (of the correct type!) needs to be set up
+manually, especially if the number of variable arguments is zero. This
+is not obvious at first and also takes unnecessary writing.
+
+The rest of the code is straightforward again. At least a common
+wrapper for the `length` call would be nice, but since the result object
+doesn't actually implement a collection interface, the point about
+having better collection iteration is kind of moot here.
+
+## A better DSL
+
+Considering how verbose the previous examples were, what would the
+"ideal" way look like?
+
+There are different ways which are more or less intertwined with Java
+semantics.
+On the one end, we could imagine something akin to "Java in Lisp":
+
+```
+(import '(javax.imageio.ImageIO java.io.FileInputStream))
+
+(defun read-image (pathname)
+  (ImageIO/read (FileInputStream. pathname)))
+```
+
+Which is almost how it would look in Clojure. However, this
+complicates the semantics. While importing would be an extension to the
+package mechanism (or possibly just a file-wide setting), the
+`Class/field` syntax and `Class.` syntax are non-trivial reader
+extensions, not from the actual implementation point of view, but from
+the user point of view. They'd basically disallow a wide range of
+formerly legal Lisp names.
+
+```
+(import '(javax.imageio.ImageIO java.io.FileInputStream))
+
+(defun read-image (pathname)
+  (#"read" 'ImageIO (new 'FileInputStream pathname)))
+```
+
+This way is the middle ground that would be possible. The one addition
+to the current JSS system is the importing of Java names and
+corresponding interaction with the FFI.
+
+The similarity with CLOS would be the use of symbols for class names,
+but the distinction is still there, since there's not much in terms of
+integrating CLOS and Java OO yet (which might not be desirable anyway?).
+
+Auto-coercion to Java data types also takes place in both cases.
+Generally this would be appropriate, except for places where we'd really
+want the Java side to receive a Lisp object. Having a special variable
+to *disable* conversion might be enough for these purposes.
+
+## Summary
+
+After introducing the necessary steps to start using ABCL with "native"
+Java libraries, we transcribed two example programs from the library
+homepage.
+
+Part of this process was to examine what the interaction between the
+Common Lisp and Java parts looks like, using the "raw" and the
+simplified JSS API. In all cases the FFI is clunkier than it needs to be.
+In particular, the additional Java namespaces make things longer than
+necessary.
+The obvious way of "importing" classes by storing a reference in a Lisp
+variable is viable, but again isn't automated.
+
+Based on the verbose nature of the Java calls, an idea of how a more
+concise FFI DSL could look was developed and discussed. In the future
+it could be developed fully and integrated (as a contrib) into ABCL.
-- 
1.7.10.4