When your code fails after being packaged as a JAR
This is a "today I learned" kind of post. The code I want to show may appear boring by itself, as it just loads a file from resources. I found it interesting, though, because it works when run from tests (e.g. with sbt test) but fails after being packaged as a JAR. Realizing that was the beginning of an engaging investigation.
I am using Scala in this post, but the essence remains the same for any code targeting the JVM.
The case
Let's say we want to read a CSV file using the scala-csv library. CSVReader has a method open which accepts an argument of type File. Thus, provided we want to read a file from the filesystem, we can write something like this:
def readFromFilesystem: List[List[String]] = {
  CSVReader.open(new File("sample.csv")).all()
}
However, the case I want to focus on in this post is reading from a resource. We can start with the following code:
def readAsResource: List[List[String]] = {
  val classloader = Thread.currentThread.getContextClassLoader
  val url = classloader.getResource("resource.csv")
  val file = Paths.get(url.toURI).toFile
  CSVReader.open(file).all()
}
It is slightly more involved, and that toURI looks a bit dubious, but let's give it a try. We will also write a test, so any potential problem should be caught by it.
We can call both methods from the main method:
object Main {
  def main(args: Array[String]): Unit = {
    println(s"readFromFilesystem: ${Reader.readFromFilesystem}")
    println(s"readAsResource: ${Reader.readAsResource}")
  }
}
Then we run it with sbt reStart, which produces the following result:
readFromFilesystem: List(List(a, b, c), List(d, e, f))
readAsResource: List(List(g, h, i), List(j, k, l))
This is exactly what is expected.
If we create a test it will also work:
[info] Tests: succeeded 2, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
Everything looks fine. Then - time to deploy?
> sbt assembly
...
> java --show-version -jar target/scala-2.13/read-resource-assembly-1.0.jar
openjdk 11.0.2 2019-01-15
OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
readFromFilesystem: List(List(a, b, c), List(d, e, f))
Exception in thread "main" java.nio.file.FileSystemNotFoundException
at jdk.zipfs/jdk.nio.zipfs.ZipFileSystemProvider.getFileSystem(ZipFileSystemProvider.java:169)
at jdk.zipfs/jdk.nio.zipfs.ZipFileSystemProvider.getPath(ZipFileSystemProvider.java:155)
at java.base/java.nio.file.Path.of(Path.java:208)
at java.base/java.nio.file.Paths.get(Paths.java:97)
at pl.msitko.Reader$.readAsResource(Reader.scala:21)
at pl.msitko.Main$.main(Main.scala:10)
at pl.msitko.Main.main(Main.scala)
Oops, that does not look good. Let's see what went wrong.
Diving in
If we print out classloader.getResource("resource.csv") for the packaged application, we will see:
jar:file:/path/to/the/project/target/scala-2.13/read-resource-assembly-1.0.jar!/resource.csv
By the way, if we print out the same value during tests, the result will be file:/path/to/the/project/target/scala-2.13/classes/resource.csv, which explains why that code worked when run as a test. During tests the resource's URL points to the local file system.
The stack trace mentions ZipFileSystemProvider. After taking a look at its code and some legacy docs, we may try the following:
def readAsResource: List[List[String]] = {
  val classloader = Thread.currentThread.getContextClassLoader
  val url = classloader.getResource("resource.csv")
  // the next three lines are new compared to the previous code
  val jarProvider = FileSystemProvider.installedProviders.asScala.toList.filter(_.getScheme == "jar").head
  val jarUrl = new URI("jar:file:/path/to/the/project/target/scala-2.13/read-resource-assembly-1.0.jar")
  jarProvider.newFileSystem(jarUrl, Map.empty[String, Any].asJava)
  val file = Paths.get(url.toURI).toFile
  CSVReader.open(file).all()
}
That code is quite naive and assumes we know the location of the JAR file beforehand, but we are just playing around here. It yields:
readFromFilesystem: List(List(a, b, c), List(d, e, f))
Exception in thread "main" java.lang.UnsupportedOperationException
at jdk.zipfs/jdk.nio.zipfs.ZipPath.toFile(ZipPath.java:661)
at pl.msitko.Reader$.readAsResource(Reader.scala:25)
at pl.msitko.Main$.main(Main.scala:10)
at pl.msitko.Main.main(Main.scala)
There is some progress: instead of the previous FileSystemNotFoundException, we got an UnsupportedOperationException. After looking at the ZipPath.toFile implementation, the culprit seems obvious:
@Override
public final File toFile() {
    throw new UnsupportedOperationException();
}
That implementation makes sense considering that java.io.File is meant to model local files. There is simply no local path for a collection of bytes within a ZIP file (a JAR is technically a ZIP file). To conclude: the URL returned by ClassLoader.getResource cannot be converted to java.io.File when the resource lives inside a JAR, because such a resource cannot be expressed as a java.io.File.
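One way to make that conclusion explicit in code is to convert a resource URL to a File only when its protocol is file, and to treat everything else as "no file available". The sketch below is only an illustration; ResourceFiles and asFile are hypothetical names and not part of the original project.

```scala
import java.io.File
import java.net.URL
import java.nio.file.Paths

object ResourceFiles {
  // Hypothetical helper: returns a File only when the URL points at the local
  // file system. A "jar:" URL (or any other scheme) yields None instead of
  // failing later with FileSystemNotFoundException or UnsupportedOperationException.
  def asFile(url: URL): Option[File] =
    if (url.getProtocol == "file") Some(Paths.get(url.toURI).toFile)
    else None
}
```

With such a helper the "resource is not a file" case shows up in the return type, so the caller is forced to decide what to do once the code runs from a JAR.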
Back to the initial task
With that conclusion we can go back to the initial scala-csv example. Another method for working with resources provided by ClassLoader is getResourceAsStream. We cannot use its result directly, as CSVReader has no API entry which accepts an InputStream. Fortunately, among the numerous overloaded CSVReader.open methods there is one which takes a java.io.Reader as an argument. So we can rewrite the code which loads the CSV from a resource:
def readResourceUsingReader: List[List[String]] = {
  val classloader = Thread.currentThread.getContextClassLoader
  val stream = classloader.getResourceAsStream("resource.csv")
  val reader = new InputStreamReader(stream, java.nio.charset.StandardCharsets.UTF_8)
  CSVReader.open(reader).all()
}
By using getResourceAsStream we avoid the issues with File altogether.
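Two details are worth keeping in mind with the stream-based approach: getResourceAsStream returns null when the resource is missing, and the stream should be closed after use. Below is a minimal sketch of handling both, assuming Scala 2.13 (for scala.util.Using). It reads plain lines rather than going through scala-csv so that it stays self-contained; ResourceLines and its read method are hypothetical names.

```scala
import java.nio.charset.StandardCharsets
import scala.io.Source
import scala.util.Using

object ResourceLines {
  // Hypothetical helper: reads a classpath resource line by line.
  // Returns None when the resource does not exist (getResourceAsStream
  // returns null in that case) and closes the stream once reading is done.
  def read(
      name: String,
      loader: ClassLoader = Thread.currentThread.getContextClassLoader
  ): Option[List[String]] =
    Option(loader.getResourceAsStream(name)).map { stream =>
      Using.resource(Source.fromInputStream(stream, StandardCharsets.UTF_8.name)) { source =>
        source.getLines().toList
      }
    }
}
```

The same shape should work both from the classes directory during tests and from within a JAR, because no File is involved at any point.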
More on Zip File System Provider
Since the Java SE 7 release, the Zip File System Provider has been bundled with the JDK. Earlier, we got it working with newFileSystem and managed to resolve the URL into a Path. Thanks to that we can use any API which accepts a Path; for example, we can read all bytes of that resource file with Files.readAllBytes.
That being said, that code is quite hacky and I would consider it a last-resort solution.
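For completeness, here is a rough sketch of the Path-based route that does not hardcode the JAR location: it derives the archive URI from the resource URI by splitting at the first "!". JarResource, split and readAllBytes are hypothetical names, and the splitting is an assumption that holds for simple jar:file:...!/entry URIs, not for nested archives.

```scala
import java.net.URI
import java.nio.file.{FileSystems, Files, Paths}
import scala.jdk.CollectionConverters._
import scala.util.Using

object JarResource {
  // Splits "jar:file:/app.jar!/resource.csv" into
  // ("jar:file:/app.jar", "/resource.csv"); any non-jar URI yields None.
  def split(resource: URI): Option[(URI, String)] =
    if (resource.getScheme != "jar") None
    else resource.toString.split("!", 2) match {
      case Array(archive, entry) => Some((URI.create(archive), entry))
      case _                     => None
    }

  // Reads the resource through a temporary zip file system when it lives in
  // a JAR, and directly from the local file system otherwise.
  def readAllBytes(resource: URI): Array[Byte] = split(resource) match {
    case Some((archive, entry)) =>
      Using.resource(FileSystems.newFileSystem(archive, Map.empty[String, Any].asJava)) { fs =>
        Files.readAllBytes(fs.getPath(entry))
      }
    case None =>
      Files.readAllBytes(Paths.get(resource))
  }
}
```

Note that FileSystems.newFileSystem throws FileSystemAlreadyExistsException when a zip file system for the same archive is already open, which is one more reason to prefer getResourceAsStream where possible.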
Key takeaways
- As a library developer, you should provide alternatives to APIs that use java.io.File. java.nio.file.Path is probably a good idea, as it is more general.
- You should realize that if your application is packaged as a JAR, there is no resource file at runtime. There is only a single JAR file and a classloader which knows how to resolve the resource path. While it may sound obvious to many readers, it can be really counterintuitive to many developers because they spend most of their time simply developing their code. At development time the simple association resource = file works, but at runtime it is no longer valid.
- As a consequence of the above point, be cautious with java.lang.ClassLoader.getResource, as it may return a URL that is not convertible to java.io.File. What is worse, you will learn about it only after packaging and running the code.
- Be mindful of the differences between the environment in which you run tests and the production environment. The example described here is just one of a few differences between running Java code inside your build tool and from within a JAR.
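One possible way to narrow the gap between tests and production is to exercise resource-loading code against a real JAR inside the test itself. The sketch below builds a throwaway JAR in a temporary directory and exposes it through a URLClassLoader; JarFixture and its methods are hypothetical names, not something taken from the repository.

```scala
import java.net.URLClassLoader
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import java.util.jar.{JarEntry, JarOutputStream}
import scala.util.Using

object JarFixture {
  // Creates a temporary JAR containing a single text entry.
  def create(entryName: String, content: String): Path = {
    val jar = Files.createTempFile("fixture", ".jar")
    Using.resource(new JarOutputStream(Files.newOutputStream(jar))) { out =>
      out.putNextEntry(new JarEntry(entryName))
      out.write(content.getBytes(StandardCharsets.UTF_8))
      out.closeEntry()
    }
    jar
  }

  // Runs `body` with a class loader that sees only the given JAR
  // (the parent is the bootstrap loader), then closes the loader.
  def withJarClassLoader[A](jar: Path)(body: ClassLoader => A): A =
    Using.resource(new URLClassLoader(Array(jar.toUri.toURL), null))(body)
}
```

Inside withJarClassLoader, getResource returns a jar: URL just like in the packaged application, so code that silently assumes resource = file should fail in the test rather than after deployment.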
Github repository
Repository with code used in this article