Project Voldemort: Features, Comparison with other Databases and Application

Abstract

Technology is advancing rapidly, and new features appear all the time. Relational databases cannot meet all of the resulting requirements, because they are rigidly structured and cannot grow or shrink according to storage needs. To overcome this limitation, a database should accommodate huge datasets that are not only properly structured but also semi-structured or unstructured. This is where databases such as Project Voldemort become significant. In this review paper, we discuss Project Voldemort, a distributed database for unstructured data that follows the key-value model. Our prime focus is on the evolution of Voldemort and its current state.

Keywords: key-value storage system, comparison with relational databases, quick start, pros and cons, queries

Introduction

Project Voldemort is an open-source database that follows the key-value design of Amazon's Dynamo. There are many distributed databases, but not every one of them offers data replication. Project Voldemort was developed by LinkedIn. Its initial release was in 2009, and it was released as a stable database product by 2017. The source code is available under the Apache 2.0 license, so anyone can report and fix bugs and contribute updates. Project Voldemort is written in Java and is available in English worldwide. The name Voldemort comes from the fictional character in the famous Harry Potter series. LinkedIn uses Project Voldemort for high scalability.

The prime features of Project Voldemort are as follows:

Data can be automatically replicated, or copied, to multiple servers. This saves both time and code.

Data can be partitioned so that only the needed part of the entire database is sent to the server that requires it.
Hence, each server holds only a subset of the whole database, which also helps with load balancing.

Each data item is versioned, that is, it is locked optimistically, so that data integrity is preserved in failure cases.

Each node is independent of the others, so there is no central point of failure. When a server fails, its load is distributed equally over all remaining servers in the cluster.

Pluggable serialization allows rich keys and values, including tuples with named fields, and checks data against an expected schema, which avoids severe errors.

Pluggable data placement strategies are supported, which help distribute data across data centres that are geographically far apart.

The versioning technique is a simple form of optimistic locking. We store a unique counter or "clock" value with each piece of data and only allow an update when the update specifies the correct clock value. This is efficient and works well in a centralized system, but it is not always suitable for a distributed system, because there the data is replicated and therefore redundant across servers.

Voldemort is neither a relational database nor an object-oriented database. Hence, it does not satisfy ACID properties, it does not attempt to map object reference graphs, and it does not introduce an abstraction such as document orientation.

Compared with a relational database, Project Voldemort is essentially a big, distributed, persistent, fault-tolerant hash table. It supports horizontal scaling and offers much higher availability, though this comes at a considerable loss of convenience.

Project Voldemort does not have a caching tier, since it combines in-memory caching with the storage system.
Hence, a separate caching tier is not required, because the storage system alone is sufficient.

Project Voldemort performs both reads and writes in a horizontally scalable manner, which sets it apart from relational databases.

Another major difference from a relational database is data partitioning, which allows the cluster to expand and shrink without rebalancing all data.

Voldemort also lends itself to unit testing, since the storage layer is mockable.

To enable high performance, this distributed database allows only very simple key-value data access. An important part of the design is that both keys and values can be complex objects, which can include maps as well as lists.

The only supported queries, which are executed efficiently, are:

    value = store.get(key)
    store.put(key, value)
    store.delete(key)
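
To make the three queries and the counter-based versioning described earlier more concrete, the following is a minimal sketch in Python. It is only an illustration under stated assumptions: `ToyStore`, `Versioned`, `StaleVersionError` and the `expected_version` parameter are hypothetical names chosen here, the store lives in the memory of a single process, and nothing in it is Voldemort's actual client API. It uses a single integer counter per key, which matches the centralized case discussed above and does not address replication.

```python
from dataclasses import dataclass
from typing import Any, Dict, Hashable, Optional


@dataclass(frozen=True)
class Versioned:
    """A value paired with the counter ("clock") it was stored under."""
    value: Any
    version: int


class StaleVersionError(Exception):
    """Raised when an update does not carry the current clock value."""


class ToyStore:
    """Hypothetical in-memory key-value store; not Voldemort's client API."""

    def __init__(self) -> None:
        self._data: Dict[Hashable, Versioned] = {}

    def get(self, key: Hashable) -> Optional[Versioned]:
        # Return the value together with its version, or None if absent.
        return self._data.get(key)

    def put(self, key: Hashable, value: Any, expected_version: int = 0) -> int:
        # Optimistic locking: accept the write only if the caller has seen
        # the latest version (0 means "the key does not exist yet").
        current = self._data.get(key)
        current_version = current.version if current else 0
        if expected_version != current_version:
            raise StaleVersionError(
                f"expected version {expected_version}, store has {current_version}"
            )
        new_version = current_version + 1
        self._data[key] = Versioned(value, new_version)
        return new_version

    def delete(self, key: Hashable) -> bool:
        # Report whether a value was actually removed.
        return self._data.pop(key, None) is not None


# Keys and values may be composite objects (here a tuple key and a dict value).
store = ToyStore()
key = ("user", 42)
first = store.put(key, {"name": "Ada", "tags": ["admin", "dev"]})
seen = store.get(key)
store.put(key, {"name": "Ada", "tags": ["admin"]}, expected_version=seen.version)

try:
    # This writer still holds the old version, so the update is rejected.
    store.put(key, {"name": "stale"}, expected_version=first)
except StaleVersionError as err:
    print("rejected:", err)

store.delete(key)
```

In this sketch a writer must first read the current version and pass it back with its update, so two writers that read the same version cannot both succeed. A single counter is enough only because there is one copy of the data; with replicas on several servers a richer versioning scheme would be needed.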