Data and Databases

What is Data?

In the era of digitization, we interact with a plethora of applications for day-to-day tasks and thus generate large amounts of data. Data can be a fact, a piece of information, or a statistic that is generated, processed, and analyzed for various purposes. It can be categorized into structured and unstructured data. Structured data follows a schema and a format, like details about cars that can be stored in a table of rows and columns. Unstructured data, like video and images, has no such predefined format. Any piece of data can give a competitive advantage to an individual or an enterprise, and it therefore needs to be protected and used responsibly.

What are Databases?

Databases are structured collections of data organized in a manner that makes storage, retrieval, and management of information quick and effective. Databases serve many purposes for enterprises and play a fundamental role in modern technology and information management. A few key uses are:

  • Information storage and security
  • Data analysis
  • Application development
  • Data integrity
  • Compliance and auditing
  • Backup and recovery

Databases are broadly categorized into relational and non-relational. Relational databases follow a fixed schema and store data in tables of rows and columns. Non-relational databases do not have a fixed schema. Relational databases are good for transactional processes and are further classified into OLAP and OLTP.

OLAP stands for online analytical processing and OLTP for online transaction processing. Both work with similar relational data structures but take different storage approaches: OLTP databases store data in rows, while OLAP databases store data in columnar form. Non-relational databases, also called NoSQL (Not only SQL), commonly store data as key-value pairs.
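To make the storage approaches concrete, here is a minimal Python sketch that holds the same records in a row-oriented layout (suited to OLTP-style access to whole records), a column-oriented layout (suited to OLAP-style aggregation over one column), and a key-value layout. The car records and the helper names `fetch_car` and `average_price` are hypothetical and used only for illustration; real database engines are far more elaborate.

```python
# Three hypothetical car records, used only for illustration.
cars = [
    {"id": 1, "make": "Toyota", "year": 2020, "price": 21000},
    {"id": 2, "make": "Honda", "year": 2019, "price": 19500},
    {"id": 3, "make": "Ford", "year": 2021, "price": 25000},
]

# Row-oriented layout: all fields of one record are stored together.
row_store = [tuple(car.values()) for car in cars]

# Column-oriented layout: all values of one column are stored together.
column_store = {key: [car[key] for car in cars] for key in cars[0]}

# Key-value layout: the whole record is looked up by a single key.
kv_store = {f"car:{car['id']}": car for car in cars}


def fetch_car(car_id):
    """OLTP-style access: read one complete record from the row store."""
    return next(row for row in row_store if row[0] == car_id)


def average_price():
    """OLAP-style access: aggregate one column without touching the others."""
    prices = column_store["price"]
    return sum(prices) / len(prices)


print(fetch_car(2))
print(round(average_price(), 2))
print(kv_store["car:3"]["make"])
```

Reading one car touches a single row in the row store, while the average only needs the `price` column in the column store. That difference is the usual reason each layout is matched to its workload.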

The following are relational database services on the major cloud platforms:

  • AWS – RDS
  • Azure – SQL Database
  • GCP – Cloud SQL

The following are non-relational database services on the major cloud platforms:

  • AWS – DynamoDB
  • Azure – Cosmos DB
  • GCP – Cloud Firestore

We have seen what databases are and why they matter, but choosing the appropriate database is just as important. A few of the criteria below can make the job easier, so let's cover them.

Selection on the basis of Transactions

Is your requirement to store transactional data? If the answer is yes, then go for a relational database. The kind of workload will narrow down the selection. If the need is to process a high rate of small reads and writes, such as orders or payments, then go for an OLTP, row-oriented database (Azure SQL Database, Cloud SQL, RDS, etc.).

If the need is to run analytical queries that scan and aggregate millions of rows, then go for an OLAP, columnar database (BigQuery).

[Figure: row storage compared with columnar storage]
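As a rough illustration of what "transactional" means in a relational database, the sketch below uses Python's built-in `sqlite3` module. The `accounts` table, its `CHECK` constraint, and the `transfer` helper are assumptions made up for this example; a production system would use a server database such as the managed services listed earlier.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one atomic transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance, so the credit
        # applied in the first statement was rolled back as well.
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)]
)
conn.commit()

ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it is undone
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(ok, failed, balances)
```

The second transfer fails halfway through, yet neither account changes. This all-or-nothing behavior is what makes relational databases a natural fit for transactional data.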

Selection on the basis of Consistency

If the requirement is strong consistency, where every read returns the latest write, then choose a relational database. If eventual consistency with a lag of a few seconds is acceptable, then choose a non-relational database.
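The difference between the two consistency models can be sketched with a toy model in Python. The `LaggingReplica` class below is hypothetical and heavily simplified: it keeps one primary copy and one replica that receives each write a fixed number of ticks later. It does not represent how any particular database implements replication.

```python
class LaggingReplica:
    """Toy store: one primary and one replica that lags by `lag` ticks."""

    def __init__(self, lag):
        self.lag = lag
        self.tick = 0
        self.primary = {}
        self.replica = {}
        self.pending = []  # queued writes as (apply_at_tick, key, value)

    def write(self, key, value):
        # The primary sees the write immediately; the replica sees it later.
        self.primary[key] = value
        self.pending.append((self.tick + self.lag, key, value))

    def advance(self, ticks=1):
        # Move time forward and apply every write whose lag has elapsed.
        self.tick += ticks
        ready = [p for p in self.pending if p[0] <= self.tick]
        self.pending = [p for p in self.pending if p[0] > self.tick]
        for _, key, value in ready:
            self.replica[key] = value

    def read_strong(self, key):
        """Strongly consistent read: always served by the primary."""
        return self.primary.get(key)

    def read_eventual(self, key):
        """Eventually consistent read: served by the replica, may be stale."""
        return self.replica.get(key)


store = LaggingReplica(lag=2)
store.write("stock", 5)
before = (store.read_strong("stock"), store.read_eventual("stock"))
store.advance(2)
after = (store.read_strong("stock"), store.read_eventual("stock"))
print(before, after)
```

Right after the write, the strong read returns the new value while the eventual read still returns nothing. Once the lag has passed, both reads agree.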

Selection on the basis of Availability & Durability

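Availability measures whether the service is accessible whenever it is needed. It is usually expressed as a percentage of uptime, such as 99.9% or 99.99%. Durability means that your data remains unchanged and usable, whether after a year or a hundred years. It is usually quoted as a number of nines, such as 9 9s or 11 9s (99.999999999%).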
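The arithmetic behind these "nines" is easy to work out. The Python sketch below converts an availability percentage into allowed downtime per year and a durability figure into an expected yearly loss. The helper names are hypothetical, and the durability calculation assumes the figure is an annual, per-object probability, which is a common interpretation but not the only one.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability_percent):
    """Minutes per year a service may be down at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def expected_annual_loss(objects, nines):
    """Expected number of objects lost per year at `nines` nines of durability.

    Assumes each object is lost independently with probability 10**-nines
    per year.
    """
    return objects * 10 ** (-nines)


for pct in (99.9, 99.99, 99.999):
    print(f"{pct}% available: {allowed_downtime_minutes(pct):.2f} min/year down")

print(expected_annual_loss(10_000_000, 11))
```

Under these assumptions, 99.9% availability allows about 525.6 minutes (roughly 8.76 hours) of downtime per year, and 99.99% allows about 52.56 minutes. With 11 9s of durability and ten million stored objects, the expected loss is about 0.0001 objects per year, or one object every ten thousand years.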

Selection on the basis of RTO & RPO

RTO is the recovery time objective; it indicates how quickly the application or service must be brought back to normal after a failure. It can be set in hours, minutes, or seconds, and the cost goes up as the target gets shorter.

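RPO is the recovery point objective; it indicates how much data, measured as a window of time, can be lost when the database or service goes down. Enterprises do not like losing data, so this window is kept small. Again, it can be set in hours, minutes, or seconds.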
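One simple way to reason about both objectives is to compare them with the timeline of an incident. The Python sketch below is a minimal illustration; the function `check_objectives` and the timestamps are hypothetical, and it assumes that everything written after the last backup is lost.

```python
from datetime import datetime, timedelta


def check_objectives(last_backup, failed_at, restored_at, rpo, rto):
    """Compare one incident against the recovery point and time objectives."""
    data_loss = failed_at - last_backup  # writes since the last backup are lost
    downtime = restored_at - failed_at   # time the service was unavailable
    return {
        "data_loss": data_loss,
        "downtime": downtime,
        "rpo_met": data_loss <= rpo,
        "rto_met": downtime <= rto,
    }


# Hypothetical incident: backup at 02:00, failure at 02:45, restored at 03:15.
report = check_objectives(
    last_backup=datetime(2024, 1, 1, 2, 0),
    failed_at=datetime(2024, 1, 1, 2, 45),
    restored_at=datetime(2024, 1, 1, 3, 15),
    rpo=timedelta(hours=1),
    rto=timedelta(minutes=15),
)
print(report)
```

In this example, 45 minutes of data are lost, which is within the one-hour RPO, but the 30 minutes of downtime exceed the 15-minute RTO. Tightening either objective usually means more frequent backups or standby capacity, which is where the extra cost comes from.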

Besides the above criteria, there are many more that can be considered depending on the use case. Hope the article has covered the important points about databases and the crucial role they play in managing information and data, which is becoming the new oil of the digital landscape.

You can share your views by writing to windroidvp@gmail.com