MM 001

“Your people will watch your feet, not your lips,” said Tom Peters.

If you can’t start a meeting on time, they will not be able to finish a project on time!

 

Docker data volumes and sharing data

Docker containers are temporary.

You can see this by starting a container from the basic ubuntu image and creating a test file or directory. As soon as you exit, the changes are left behind: a new container started from the same image will not contain them.

Let’s try to bring that container up again:
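The commands for this experiment are not shown here. A minimal sketch, assuming Docker is installed and the ubuntu image can be pulled; the file name /my_file is only an illustration, and the commands run non-interactively instead of opening a shell:

```shell
# First container: create a test file and confirm it exists.
docker run --rm ubuntu sh -c 'touch /my_file && ls -l /my_file'

# Second container from the same image: the file is not there,
# because every "docker run" starts from the unmodified image.
docker run --rm ubuntu sh -c 'ls /my_file 2>/dev/null || echo "my_file is gone"'
```

The --rm flag deletes each container when it exits. Without it the stopped container would still hold the file until it is removed, but a new docker run would start clean all the same.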

Thankfully Docker provides the solution to keep data persistent.

Docker Data Volumes

Docker data volumes are used when:

  1. You want to persist data, even through container restarts
  2. You want to share data between the host filesystem and the Docker container
  3. You want to share data with other Docker containers.

Data persistence

Older Docker releases offer no way to create a “data volume” directly (newer ones add the docker volume create command), so instead we create a data volume container with a volume attached to it. For any other container that you then want to connect to this data volume container, use Docker’s --volumes-from option to take the volumes from that container and apply them to the current one.

Let’s create a data volume container to store our volume:
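The original command is missing at this point. Going by the container name, image and directory mentioned in the text, it was probably close to the following sketch:

```shell
# Create (but do not start) a container named "datacontainer" from the
# ubuntu image, with an anonymous volume mounted at /data.
docker create -v /data --name datacontainer ubuntu
```

The container never needs to run; it only has to exist so that other containers can borrow its volume.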

This creates a container named datacontainer, based on the ubuntu image, with a volume at the directory /data.

If we repeat our initial test with the --volumes-from flag, anything we write to the /data directory in the current container will be saved to the /data volume of our datacontainer.

Now run a new container and check whether /data/my_file has persisted.
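Both steps can be sketched as follows, assuming a data volume container named datacontainer with a volume at /data already exists. The file content is an arbitrary choice:

```shell
# Write a file into /data from a throwaway container that borrows
# the volume of "datacontainer".
docker run --rm --volumes-from datacontainer ubuntu sh -c 'echo "hello" > /data/my_file'

# Start a second, fresh container with the same volume and read the file back.
docker run --rm --volumes-from datacontainer ubuntu cat /data/my_file
```

If the volume works as described, the second command prints the content written by the first, even though the two containers are unrelated.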

You can create as many data volume containers as you like, but you cannot choose where the volume is mounted in the consuming container: it always appears at the path defined by the data volume container (/data in our example).

Sharing data between the host and a container – shared volumes

There is often a need to share data between the host and the container. Docker gives you the option to run a container and override one of its directories with the contents of a directory on the host system.

Let’s imagine you’re running your application and you want to keep the logs outside the container.

We set up a volume that links the /var/log/nginx directory from the nginx container to ~/nginxlogs on our host.
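The command itself is not shown here. A sketch of such a setup, assuming the official nginx image; the container name nginx-logs-demo and host port 5000 are illustrative choices, not taken from the original:

```shell
# Host directory that will receive the logs (created if missing).
mkdir -p ~/nginxlogs

# Run nginx in the background, mounting the host directory over
# /var/log/nginx inside the container.
docker run -d --name nginx-logs-demo -p 5000:80 -v ~/nginxlogs:/var/log/nginx nginx
```

After a request to port 5000 on the host, the nginx log files should show up in ~/nginxlogs.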

If you make any changes to the ~/nginxlogs folder, you’ll be able to see them from inside the Docker container in real-time as well.

Additional links:

  1. https://docs.docker.com/engine/userguide/containers/dockervolumes/

MongoDB – Installation and basic commands

MongoDB is a document-oriented NoSQL database with JSON-like documents and a dynamic schema:

  • Tables are collections
  • Rows are documents
  • No schema

The advantages of using documents:

  • Documents (objects) correspond to native data types in many programming languages.
  • Embedded documents and arrays reduce need for expensive joins.
  • Ability to perform dynamic queries on documents using document-based query language.
  • Conversion of objects to database objects is not needed
  • High performance – uses internal memory to store the working set for quick data access.

Advantages of MongoDB vs RDBMS

  • Schema-less – documents are flexible compared to the “table” structure, with its nullable fields, of an RDBMS. One collection can even hold different kinds of documents: the number of fields, content and size can differ from one document to another.
  • No complex joins – related data is usually stored in the same document. We’re going to discuss embedding related data in a document and how MongoDB deals with the lack of relations.
  • Ease of scale-out – MongoDB is easy to scale.
  • Conversion / mapping of application objects to database objects not needed. See https://en.wikipedia.org/wiki/Object-relational_impedance_mismatch.
  • Document-oriented storage – data is stored in the form of JSON-style documents. Everybody is talking JSON these days: client, server and database.
  • Atomic in-place updates – $inc, $set, $push, $pop, $rename, $bit. For changes to a single document there is no need to deal with the complexity of transactions.
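To illustrate embedding and in-place updates, here is a small sketch that feeds a few statements to the mongo shell. It assumes a mongod server running locally on the default port; the database name demo, the collection posts and all field names are hypothetical:

```shell
# The post and its comments live in one document, so no join is needed.
# A single update increments a counter and appends to the embedded array.
mongo demo --quiet <<'EOF'
db.posts.insertOne({ title: "Hello", views: 0, comments: [] })
db.posts.updateOne({ title: "Hello" }, { $inc: { views: 1 }, $push: { comments: { author: "ana", text: "Nice post" } } })
db.posts.find({ title: "Hello" })
EOF
```

Both operators are applied in the same update, so other clients never see the counter changed without the new comment.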

Where should you use MongoDB?

  • Big Data
  • Content Management and Delivery
  • Mobile and Social Infrastructure
  • User Data Management
  • Data Hub

Installation

MongoDB is cross-platform, so the major operating systems are covered, with the option to install manually or via a setup procedure (msi, apt-get, homebrew).

Below are the steps to manually install MongoDB on the Windows platform. You can find the latest stable release here.

For Debian/Ubuntu installation, please use this link.

For the installation of MongoDB I recommend using a custom folder (e.g. d:\mongodb). It is also recommended to open cmd.exe as admin and then create the data\db and log folders.

You can simplify things by adding some settings to mongo.config to adjust the configuration of the mongod service:
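The original file contents are not shown. A minimal sketch in MongoDB's YAML configuration format, assuming the d:\mongodb layout described above; the log file name mongod.log is an assumption. The file is written from a POSIX-style shell here, but creating it with any editor works just as well:

```shell
# Write a minimal config file: where to log and where to keep the data.
cat > mongo.config <<'EOF'
systemLog:
  destination: file
  path: d:\mongodb\log\mongod.log
storage:
  dbPath: d:\mongodb\data\db
EOF
```

Setting a log path matters for the next step, because mongod needs a log file when it runs as a service.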

For a manual binary installation it’s better to start mongod as a Windows service:
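The exact commands are missing here. One way to do it, as a sketch, assuming an administrator shell (Git Bash style quoting), mongod.exe on the PATH and a config file at d:\mongodb\mongo.config:

```shell
# Register mongod as a Windows service that uses the config file.
mongod.exe --config 'd:\mongodb\mongo.config' --install

# Start the service; "MongoDB" is the default service name.
net start MongoDB
```

From cmd.exe the same two commands apply without the single quotes around the path.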

Mongo Shell

Let’s start the shell by using mongo.exe. mongod.exe is the MongoDB server.

Below are some basic MongoDB shell commands:
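The original list of commands did not survive. As a stand-in, here is a short sketch of common ones, piped into the shell so it can be run in one go. It assumes a local mongod on the default port; the database demo, the collection users and the sample data are hypothetical:

```shell
# List databases, switch to one, then create, read, update and delete a document.
mongo --quiet <<'EOF'
show dbs
use demo
db.users.insertOne({ name: "Ana", age: 30 })
db.users.find({ name: "Ana" })
db.users.updateOne({ name: "Ana" }, { $set: { age: 31 } })
db.users.deleteOne({ name: "Ana" })
show collections
EOF
```

The same lines can be typed one by one at the interactive prompt after starting mongo.exe.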

This is the first article in a series of MongoDB-related ones.

Additional links:

  1. https://www.mongodb.com/presentations/mongodb-chicago-benefits-using-mongodb-over-rdbms
  2. https://dzone.com/articles/when-use-mongodb-rather-mysql

NoSQL vs SQL

Relational databases (RDBMS) were not designed to cope with the scale and agility challenges that face modern applications, nor were they built to take advantage of the commodity storage and processing power available today.

The next generation of databases addresses some of these points by being non-relational, distributed, open-source and horizontally scalable.

The movement began in early 2009 and is growing rapidly. Often more characteristics apply, such as: schema-free, easy replication support, simple API, eventually consistent / BASE (not ACID), a huge amount of data and more.

Why is NoSQL exploding? The short explanation is flexibility and power. So do we want to drop 40+ years of RDBMS? No; moreover, the community now mostly reads “NoSQL” as “not only SQL”.

NoSQL Database Types:

According to http://nosql-database.org/  there are 225+ NoSQL solutions available.

The NoSQL market is expected to be worth $3.5 billion by 2018, according to Market Research Media Ltd. Meanwhile, the RDBMS market is around $26 billion (as of 2013) with about 9 percent annual growth, so by 2018 it will reach about $40 billion.
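The $40 billion figure follows from compounding the 9 percent growth over the five years from 2013 to 2018. A quick check with awk, assuming the growth rate stays constant:

```shell
# 26 billion growing by 9% per year for 5 years, in billions of dollars.
awk 'BEGIN { printf "%.1f\n", 26 * 1.09 ^ 5 }'
```

This prints 40.0, which matches the rounded figure in the text.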

In future posts I’m going to focus on the MongoDB database, covering everything from basic aspects and DevOps to more advanced topics.

References:

  1. http://www.wired.com/insights/2013/09/the-future-of-enterprise-data-rdbms-will-be-there/
  2. http://nosql-database.org/
  3. https://www.mongodb.com/nosql-explained
  4. http://slamdata.com/blog/2014/07/07/Henry-Ford-NoSQL-data-and-the-future-of-analytics.html