NoSQL, also called Not Only SQL, is
an approach to data management and database design that's useful for very large
sets of distributed data. NoSQL, which encompasses a wide range of
technologies and architectures, seeks to solve the scalability and big data performance issues
that relational databases weren’t designed to
address. NoSQL is especially useful when an enterprise needs to access and
analyze massive amounts of unstructured data or data that's
stored remotely on multiple virtual servers in the cloud.
The approach taken by two of today’s best-known NoSQL systems, CouchDB and
MongoDB, was inspired by Lotus Notes, the collaboration and personal
productivity tool that Lotus founder Mitch Kapor and Ray Ozzie, later
Microsoft's chief software architect, worked together to create. Notes was
first released back in 1989.
Like Notes, these database systems
store information not as normalized relational tables, but as documents in a
rich, self-describing structure. Both use a variant of JavaScript Object
Notation (JSON) to store these documents. JSON is somewhat like XML, but offers
more compact storage and lower processing overhead. Document databases
primarily appeal to developers for the very reason that relational databases
don’t.
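To make the contrast concrete, here is a minimal Python sketch of the same order held first as normalized rows and then as one self-describing document. The table names, field names, and the `to_document` helper are hypothetical, invented purely for illustration; the sketch uses plain JSON from the standard library and does not reproduce the storage format of CouchDB or MongoDB.

```python
import json

# Hypothetical normalized rows: one "table" per entity, linked by ids.
customers = [{"id": 1, "name": "Ada"}]
orders = [{"id": 100, "customer_id": 1}]
order_items = [
    {"order_id": 100, "sku": "A-1", "qty": 2},
    {"order_id": 100, "sku": "B-7", "qty": 1},
]


def to_document(order_id):
    """Assemble one nested document from the normalized rows (illustrative helper)."""
    order = next(o for o in orders if o["id"] == order_id)
    customer = next(c for c in customers if c["id"] == order["customer_id"])
    return {
        "_id": order["id"],
        # Related data is embedded, so reading the order needs no join.
        "customer": {"name": customer["name"]},
        "items": [
            {"sku": item["sku"], "qty": item["qty"]}
            for item in order_items
            if item["order_id"] == order_id
        ],
    }


doc = to_document(100)
print(json.dumps(doc, indent=2))
```

The document carries its own field names and nesting, so a new field can be added to one order without altering a table schema shared by every other order.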
Contrary
to misconceptions caused by its name, NoSQL does not prohibit structured query
language (SQL).
While it's true that some NoSQL systems are entirely non-relational, others
simply avoid selected relational functionality such as fixed table schemas and join operations.
One of the most popular NoSQL databases is Apache
Cassandra. Cassandra, which was once Facebook’s proprietary database, was
released as open source in 2008. Other NoSQL implementations and related
technologies include SimpleDB, Google Bigtable, Apache Hadoop, MapReduce,
MemcacheDB, and Voldemort. Companies that use NoSQL include Netflix, LinkedIn, and Twitter.
Meanwhile, IBM, Oracle, and Microsoft
are seeing an unprecedented challenge to their SQL licensing money machines
thanks to enterprise data workloads that are getting more diverse, larger, and
more complex. In those cases, IT will trade stringent data consistency for low
latency and a favorable cost versus performance balance.
The workloads for which NoSQL is ideal also feature
staggering volumes and velocities of data; we’re talking, at the low end, a few
terabytes of records, with 10,000 or 20,000 concurrent inserts per second,
minimum. Once you’re at that level, IT needs to focus on platforms that can
serve as the infrastructure layer, capable of scaling out without costing a
fortune.
Under heavy volume, NoSQL can be as much
as 100 times as fast as SQL for such workloads, according to the tests performed
by the NoSQL open source community. Oracle, reporting on the performance of its own NoSQL
offering, demonstrated excellent scalability, throughput, and latency; one test
showed close to 100,000 inserts per second and 3.7 milliseconds average latency
for 360 client threads. As for costs, conventional SQL (hardware and software
included) ends up about five to 10 times as costly as a typical NoSQL setup.
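As a rough sanity check on the Oracle figures quoted above, the short Python sketch below applies Little's law (throughput = requests in flight / average latency). It assumes each of the 360 client threads issues one synchronous insert at a time, which the source does not state; the function name is made up for this example.

```python
def closed_loop_throughput(client_threads, avg_latency_seconds):
    """Little's law for a closed loop: one outstanding request per thread (assumption)."""
    return client_threads / avg_latency_seconds


# Figures from the text: 360 client threads, 3.7 ms average latency.
ops = closed_loop_throughput(360, 3.7e-3)
print(f"{ops:,.0f} inserts per second")  # prints "97,297 inserts per second"
```

Under that assumption the result is consistent with the "close to 100,000 inserts per second" reported for the test.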