Open Source NoSQL Databases in Java

Berkeley DB Java Edition

Berkeley DB JE is a high-performance, transactional storage engine written entirely in Java. Like the highly successful Berkeley DB product, Berkeley DB JE executes in the address space of the application, without the overhead of client/server communication. It stores data in the application's native format, so no runtime data translation is required. Berkeley DB JE supports full ACID transactions and recovery. It provides an easy-to-use interface, allowing programmers to store and retrieve information quickly, simply and reliably.

Berkeley DB JE was designed from the ground up in Java and takes full advantage of the Java environment. The Berkeley DB JE API provides a Java Collections-style interface, as well as a programmatic interface similar to the Berkeley DB API. The architecture supports high performance and concurrency for both read-intensive and write-intensive workloads.

Berkeley DB JE differs from other Java databases in that it is not a relational engine built in Java. It is a Berkeley DB-style embedded store, with an interface designed for programmers, not DBAs. Its architecture employs a log-based, no-overwrite storage system, enabling high concurrency and speed while providing ACID transactions and record-level locking. Berkeley DB JE efficiently caches the most commonly used data in memory, without exceeding application-specified limits. In this way it works with an application to use available JVM resources while providing access to very large data sets.

Go To Berkeley DB Java Edition
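
The log-based, no-overwrite storage described for Berkeley DB JE can be illustrated with a toy store in plain Java. This is only a sketch of the general idea, not Berkeley DB JE's implementation or API: the class AppendOnlyStore and all of its methods are hypothetical, and an in-memory list stands in for the on-disk log.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy log-structured key/value store: every update appends a record, nothing is overwritten. */
class AppendOnlyStore {
    /** One immutable log record; a null value marks a deletion (tombstone). */
    private static final class Entry {
        final String key;
        final String value;

        Entry(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Stands in for the on-disk log (assumption: a real store would write to files).
    private final List<Entry> log = new ArrayList<>();
    // Maps each live key to the offset of its most recent record in the log.
    private final Map<String, Integer> index = new HashMap<>();

    void put(String key, String value) {
        log.add(new Entry(key, value));
        index.put(key, log.size() - 1);
    }

    void delete(String key) {
        log.add(new Entry(key, null));
        index.remove(key);
    }

    String get(String key) {
        Integer offset = index.get(key);
        return offset == null ? null : log.get(offset).value;
    }

    int logSize() {
        return log.size();
    }

    /** Rebuilds the index by replaying the log from the start, as a recovery pass might. */
    void recover() {
        index.clear();
        for (int i = 0; i < log.size(); i++) {
            Entry e = log.get(i);
            if (e.value == null) {
                index.remove(e.key);
            } else {
                index.put(e.key, i);
            }
        }
    }
}
```

Because old records are never modified, readers of an earlier record are not disturbed by later writes, and the index can always be rebuilt from the log alone. A real engine would also reclaim space held by obsolete records, which this sketch omits.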


MyOODB

MyOODB (My Object-Oriented Database) is an integrated database and Web environment that provides true distributed objects, implicit/explicit multi-concurrent nested transactions, seamless Web tunneling, and database self-healing. MyOODB is one part of the two-part MyOOSDK solution; together with MyOOWEB, it provides a development environment for people who want small, fast, but powerful applications.



Perst

Perst is an object-oriented embedded database for Java and .NET applications that need to deal with persistent data. It is easy to use and provides high performance. Although Perst is very simple, it provides fault-tolerant support (ACID transactions) and concurrent access to the database. The main advantage of Perst is its tight integration with programming languages.

Go To Perst


Cassandra

The Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. Cassandra was open sourced by Facebook in 2008, and is now developed by Apache committers and contributors from many companies.

Go To Cassandra


HBase

HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable, as described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. HBase includes:

* Convenient base classes for backing Hadoop MapReduce jobs with HBase tables, including Cascading, Hive and Pig source and sink modules
* Query predicate push-down via server-side scan and get filters
* Optimizations for real-time queries
* A Thrift gateway and a RESTful Web service that supports XML, Protobuf, and binary data encoding options
* Extensible JRuby-based (JIRB) shell
* Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX

Go To HBase


HyperGraphDB

HyperGraphDB 1.0 is a general-purpose, extensible, portable, distributed, embeddable, open-source data storage mechanism. Designed specifically for artificial intelligence and semantic web projects, it can also be used as an embedded object-oriented database for projects of all sizes. HyperGraphDB is a Java-based product built on top of the Berkeley DB storage library. It can be used as a single in-process database bound to a location on the local disk or within a cluster of networked database instances communicating and sharing data in a P2P (peer-to-peer) fashion. Key features of HyperGraphDB include:

* Storage of generalized hypergraphs.
* Open, extensible type system.
* Basic query system and graph traversal algorithms.
* Out-of-the-box support for Java object storage.
* Thread-safe transactions.
* P2P framework for data distribution.

Go To HyperGraphDB


InfoGrid

InfoGrid is an open-source internet graph database. Unlike a relational database, InfoGrid represents information as nodes and edges that relate nodes. Nodes can carry properties and can be dynamically blessed with types and unblessed. InfoGrid can be run on a variety of storage backends, including MySQL, PostgreSQL and NoSQL. InfoGrid also provides a RESTful web frontend, user-centric identity technology, and a Probe framework for the seamless integration of external data sources.

Go To InfoGrid


Neo4j

Neo4j is a graph database, a fully transactional database that stores data structured as graphs. It is a mature and robust graph database that provides:

* an intuitive graph-oriented model for data representation. Instead of static and rigid tables, rows and columns, you work with a flexible graph network consisting of nodes, relationships and properties.
* a disk-based, native storage manager completely optimized for storing graph structures for maximum performance and scalability.
* massive scalability. Neo4j can handle graphs of several billion nodes/relationships/properties on a single machine and can be sharded to scale out across multiple machines.
* a powerful traversal framework for high-speed traversals in the node space.
* deployment as a full server or as a very slim database with a small footprint (~500k jar).
* a simple and convenient object-oriented API.

Go To Neo4j
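
The graph model of nodes, relationships and properties that Neo4j describes can be sketched in plain Java. The code below does not use the Neo4j API: the class PropertyGraph, its nested types and its methods are hypothetical names chosen for illustration, and the traversal is an ordinary breadth-first search restricted to one relationship type.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Minimal in-memory property graph (illustrative only, not the Neo4j API). */
class PropertyGraph {
    /** A node with free-form properties and a list of outgoing relationships. */
    static final class Node {
        final String name;
        final Map<String, Object> properties = new HashMap<>();
        final List<Relationship> outgoing = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** A typed, directed relationship pointing at its end node. */
    static final class Relationship {
        final String type;
        final Node end;

        Relationship(String type, Node end) {
            this.type = type;
            this.end = end;
        }
    }

    private final Map<String, Node> nodes = new HashMap<>();

    /** Returns the node with this name, creating it on first use. */
    Node node(String name) {
        return nodes.computeIfAbsent(name, Node::new);
    }

    void relate(String from, String type, String to) {
        node(from).outgoing.add(new Relationship(type, node(to)));
    }

    /** Breadth-first traversal that follows only relationships of the given type. */
    List<String> traverse(String start, String type) {
        Set<Node> seen = new LinkedHashSet<>();
        Deque<Node> queue = new ArrayDeque<>();
        Node first = node(start);
        seen.add(first);
        queue.add(first);
        while (!queue.isEmpty()) {
            Node current = queue.poll();
            for (Relationship r : current.outgoing) {
                if (r.type.equals(type) && seen.add(r.end)) {
                    queue.add(r.end);
                }
            }
        }
        List<String> names = new ArrayList<>();
        for (Node n : seen) {
            names.add(n.name);
        }
        return names;
    }
}
```

Each node holds direct references to its neighbors, so following a relationship is a pointer hop rather than a table join, which is the property the "flexible graph network" description relies on.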

NeoDatis ODB

NeoDatis ODB is a new-generation object-oriented database. ODB is a transparent persistence layer that allows anyone to persist native objects with a single line of code. ODB can be embedded directly in any product as a database engine or run in client/server mode.

Go To NeoDatis ODB


OrientDB

OrientDB is a new open-source NoSQL DBMS that aims to combine the best features of other NoSQL products. It is written in Java and is very fast: it can store up to 150,000 records per second on common hardware. Although it is a document-based database, relationships are managed as in graph databases, with direct connections among records. You can traverse whole trees and graphs of records, or parts of them, in a few milliseconds. It supports schema-less, schema-full and schema-mixed modes. It has a strong security profiling system based on users and roles, and it supports SQL among its query languages. Thanks to the SQL layer, it is straightforward to use for people with a relational database background.

Go To OrientDB


Terrastore

Terrastore is a modern document store which provides advanced scalability and elasticity features without sacrificing consistency. Terrastore is based on Terracotta, so it relies on an industry-proven, fast clustering technology. It supports single-cluster and multi-cluster deployments.

Go To Terrastore


Voldemort

Voldemort is a distributed key-value storage system:

* Data is automatically replicated over multiple servers.
* Data is automatically partitioned so each server contains only a subset of the total data.
* Server failure is handled transparently.
* Pluggable serialization is supported to allow rich keys and values including lists and tuples with named fields, as well as to integrate with common serialization frameworks like Protocol Buffers, Thrift, Avro and Java Serialization.
* Data items are versioned to maximize data integrity in failure scenarios without compromising availability of the system.
* Each node is independent of other nodes, with no central point of failure or coordination.
* Good single-node performance: you can expect 10-20k operations per second depending on the machines, the network, the disk system, and the data replication factor.
* Support for pluggable data placement strategies to support things like distribution across data centers that are geographically far apart.

Go To Voldemort
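
The partitioning and replication behavior listed for Voldemort can be illustrated with a small routing sketch in plain Java. This is not Voldemort's code or API: PartitionRouter and its methods are hypothetical, and simple modulo hashing is used here as a stand-in for whatever data placement strategy a real deployment plugs in.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy router: maps a key to a primary node and the replicas that follow it. */
class PartitionRouter {
    private final List<String> nodes;
    private final int replicationFactor;

    PartitionRouter(List<String> nodes, int replicationFactor) {
        if (nodes.isEmpty() || replicationFactor < 1 || replicationFactor > nodes.size()) {
            throw new IllegalArgumentException("need 1 <= replicationFactor <= number of nodes");
        }
        this.nodes = new ArrayList<>(nodes);
        this.replicationFactor = replicationFactor;
    }

    /** Index of the primary node for a key (assumption: plain modulo hashing). */
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    /** The primary node followed by the next nodes in list order, wrapping around. */
    List<String> replicasFor(String key) {
        int primary = partitionFor(key);
        List<String> replicas = new ArrayList<>();
        for (int i = 0; i < replicationFactor; i++) {
            replicas.add(nodes.get((primary + i) % nodes.size()));
        }
        return replicas;
    }
}
```

Every client that knows the node list computes the same replica set for a key, so no central coordinator is needed to locate data. Modulo hashing reshuffles most keys when the node count changes, which is one reason real systems prefer other placement schemes.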

Java is a trademark or registered trademark of Sun Microsystems, Inc. in the United States and other countries. This site is independent of Sun Microsystems, Inc.