Discover a Fresh Modular Deep Learning Library Built in Java – Designed for True Customization!
Researchers at Politeknik Elektronika Negeri Surabaya (PENS) have published a detailed open-access paper in the Journal of Applied Data Sciences that outlines the creation of a highly modular deep learning software library entirely in Java.
Titled “Process Design of Software Library Development for Deep Learning Module in Java Programming with Four-Phase Methodology,” the work directly tackles a common frustration with popular frameworks such as PyTorch and TensorFlow: their monolithic, tightly coupled designs make it difficult for researchers to modify core mathematical formulas, loss functions, activations, or training logic without fighting the codebase.
The proposed library changes that by prioritizing modularity and extensibility from day one. It allows users, whether in academia or industry, to inject custom equations and tweak training routines with minimal friction. As part of the Analytical Library for Intelligent Computing (ALI), it complements existing ALI modules such as automatic clustering, hierarchical K-Means, KNN, and basic neural networks.
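To illustrate the kind of extension point the paper describes, here is a minimal sketch of injecting a user-defined activation function into a layer without touching library internals. All names here (`Activation`, `applyActivation`, the swish-style function) are illustrative assumptions, not the ALI library's actual API.

```java
// Illustrative sketch only: interface and method names are assumptions,
// not the ALI library's actual API.
import java.util.function.DoubleUnaryOperator;

public class CustomActivationDemo {
    // A hypothetical extension point: any double -> double function can act as an activation.
    interface Activation extends DoubleUnaryOperator {}

    // A user-supplied swish-style activation, defined entirely outside the library core.
    static final Activation SWISH = x -> x / (1.0 + Math.exp(-x));

    // Applies the injected activation element-wise to a layer's outputs.
    static double[] applyActivation(double[] inputs, Activation act) {
        double[] out = new double[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            out[i] = act.applyAsDouble(inputs[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] result = applyActivation(new double[]{-1.0, 0.0, 2.0}, SWISH);
        System.out.printf("%.4f %.4f %.4f%n", result[0], result[1], result[2]);
    }
}
```

Because the activation is just a functional interface, swapping in a novel equation is a one-line change for the user, which is the friction-free customization the paper aims for.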
What makes this stand out is the authors' clear, replicable four-phase methodology that serves as a complete blueprint for building specialized DL libraries.
They start with Preparation, where the team brainstorms ideas, creates guiding questions (e.g., exploring why deep learning dominates complex tasks and how to make it more flexible), and conducts a thorough literature review comparing tools like PyTorch (great for dynamic tensors and custom ops), Torchreid (person re-identification focus), and TorchIO (medical imaging preprocessing). This phase maps gaps and sets the conceptual foundation.
Next comes Identification, extracting key terms, defining measurable goals, and setting up the environment.
Java was chosen for its speed on large datasets (often faster than Python), efficient garbage collection, massive developer ecosystem (around 9 million developers, deployed on over 3 billion devices), and strong CPU performance. Four core design principles guide everything: understandable and customizable for end users; maintainable and extensible for library maintainers.
Concrete targets include at least 1,000 downloads in the first 100 days after release, 97 functions passing unit tests with perfect "green" status, execution times in the millisecond range, and Big-O scalability checks.
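Measuring millisecond-range execution is straightforward in plain Java. The sketch below shows one way such a target could be checked; the `predict` workload is a placeholder stand-in, not the paper's benchmark harness.

```java
// Minimal timing sketch. The predict() body is a placeholder workload,
// not the library's actual inference code.
public class TimingDemo {
    // Stand-in for a model prediction; a real benchmark would call the library here.
    static double predict(double[] input) {
        double sum = 0;
        for (double v : input) sum += v * v;
        return Math.tanh(sum);
    }

    public static void main(String[] args) {
        double[] input = new double[1024];
        java.util.Arrays.fill(input, 0.5);

        long start = System.nanoTime();
        predict(input);
        long elapsedNanos = System.nanoTime() - start;

        // Report in milliseconds, matching the paper's millisecond-range KPI.
        System.out.printf("prediction took %.3f ms%n", elapsedNanos / 1_000_000.0);
    }
}
```

A production benchmark would also warm up the JVM and average many runs, since JIT compilation skews single-shot timings.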
The Design phase delivers the technical blueprint using Clean Architecture with four concentric layers: Data, Model, Layer, and Computation.
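The key rule in such a layering is that dependencies point inward only. The sketch below illustrates that rule with toy stand-ins; the class names and the exact ordering of the four rings are assumptions for illustration, not the library's real types.

```java
// Toy sketch of the inward-only dependency rule in a Clean Architecture layering.
// Names and ring ordering are illustrative assumptions, not the library's actual design.
public class CleanArchSketch {
    // Inner ring: pure computation, depends on nothing outside itself.
    interface Computation { double[] forward(double[] in); }

    // Layer ring: implements the computation contract, knows nothing about models or data I/O.
    static final class DenseLayer implements Computation {
        private final double bias;
        DenseLayer(double bias) { this.bias = bias; }
        public double[] forward(double[] in) {
            double[] out = new double[in.length];
            for (int i = 0; i < in.length; i++) out[i] = in[i] + bias;
            return out;
        }
    }

    // Model ring: orchestrates layers through the inner interface, never the reverse.
    static final class Model {
        private final Computation[] layers;
        Model(Computation... layers) { this.layers = layers; }
        double[] predict(double[] in) {
            double[] x = in;
            for (Computation c : layers) x = c.forward(x);
            return x;
        }
    }

    public static void main(String[] args) {
        Model m = new Model(new DenseLayer(1.0), new DenseLayer(2.0));
        System.out.println(m.predict(new double[]{0.0})[0]); // 3.0
    }
}
```

Because outer rings talk to inner ones only through interfaces, a maintainer can replace a layer or computation without rippling changes outward.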
They created UML class diagrams, use case diagrams, architectural decision records, quality-attribute scenarios, and API specifications: 11 detailed documents in total.
Tools like Miro for collaborative brainstorming, Diagrams.net for professional UML/flowcharts, and Notion for Kanban-style task tracking kept everything organized. The Builder pattern was favored over Singleton for flexible, thread-safe layer instantiation (e.g., easily swapping convolution kernels like edge detection or sharpening filters).
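A hedged sketch of what Builder-based layer construction could look like follows. The paper states only that Builder was favored over Singleton; the class and method names below are assumptions for illustration.

```java
// Illustrative Builder sketch: names are assumptions, not the library's real API.
public class ConvLayerBuilderDemo {
    static final class ConvLayer {
        final double[][] kernel;
        final int stride;
        private ConvLayer(double[][] kernel, int stride) {
            this.kernel = kernel;
            this.stride = stride;
        }
    }

    // Each build() call yields an independent, fully configured layer with no
    // shared mutable state, one reason Builder suits this better than Singleton.
    static final class Builder {
        private double[][] kernel = {{0, 0, 0}, {0, 1, 0}, {0, 0, 0}}; // identity default
        private int stride = 1;

        Builder kernel(double[][] k) { this.kernel = k; return this; }
        Builder stride(int s) { this.stride = s; return this; }
        ConvLayer build() { return new ConvLayer(kernel, stride); }
    }

    public static void main(String[] args) {
        // Swapping kernels is a one-line change, e.g. a Laplacian edge-detection kernel.
        double[][] edge = {{0, -1, 0}, {-1, 4, -1}, {0, -1, 0}};
        ConvLayer layer = new Builder().kernel(edge).stride(1).build();
        System.out.println("kernel center = " + layer.kernel[1][1]);
    }
}
```

A Singleton would force every caller to share one layer configuration; the Builder instead lets concurrent callers assemble differently configured layers safely.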
Finally, the Development phase brings it to life through agile sprints focused on CNNs initially.
They implemented 97 functions across these modules, achieved high test coverage with JUnit, enforced Google Java Style, and ran static analysis. Early CPU benchmarks are impressive: single-image prediction takes just 4.729 ms in their library versus 182.729 ms in TensorFlow, roughly 38x faster in this test, thanks to Java's static typing and CPU optimizations. Model file size is reasonable at ~570 KB, and customizability shines through direct kernel/equation modifications.
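To make the "direct kernel modification" point concrete, here is a plain-Java 2D convolution where changing the operation is just a matter of swapping the kernel array. The method names are illustrative assumptions, not the library's API.

```java
// Sketch of direct kernel customization: swapping the kernel array swaps the operation.
// Method names are illustrative, not the library's actual API.
public class KernelConvDemo {
    // Valid (no-padding) 2D convolution with a 3x3 kernel.
    static double[][] convolve(double[][] img, double[][] k) {
        int h = img.length - 2, w = img[0].length - 2;
        double[][] out = new double[h][w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                for (int ky = 0; ky < 3; ky++)
                    for (int kx = 0; kx < 3; kx++)
                        out[y][x] += img[y + ky][x + kx] * k[ky][kx];
        return out;
    }

    public static void main(String[] args) {
        double[][] flat = new double[3][3];
        for (double[] row : flat) java.util.Arrays.fill(row, 5.0);
        // Laplacian edge kernel sums to zero, so a uniform region convolves to zero.
        double[][] edge = {{0, -1, 0}, {-1, 4, -1}, {0, -1, 0}};
        System.out.println(convolve(flat, edge)[0][0]); // 0.0
    }
}
```

Replacing `edge` with a sharpening kernel changes the behavior with no other code edits, which is exactly the kind of user-level flexibility the benchmarks section highlights.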
While GPU support is planned for later, the current version emphasizes pure-Java, CPU-optimized, maintainable code, ideal for education, prototyping, and environments without heavy GPU access.
This paper fills a real literature gap: most DL libraries highlight features or performance but rarely share structured planning, design rationale, and architectural decisions in depth. By documenting everything from ideation flowcharts to ADRs and performance KPIs, the authors provide a practical, step-by-step framework that other researchers can follow to build their own specialized tools.
If you're someone who experiments with custom losses, novel activations, or optimized convolution ops, or if you're just interested in Java-powered AI, this is worth checking out!