A good combination to manage:
Code check-in to Git --> Git checkout --> Build Docker --> Use GoCD --> Deploy to multiple environments --> Manage environments with Rancher.
Acknowledged Write – 1 – DB Setting – False – Fast Response – Small chance of a missed write
Acknowledged Write – 1 – DB Setting – True – Slow Response – No Error
Unacknowledged Write – 0 – Not waiting for the server to respond.
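The three combinations above can be written down as connection-string options. This is only a minimal sketch: it assumes the "DB Setting" column refers to journaling (the `journal` option), and both the host `localhost:27017` and the helper `build_uri` are made up for illustration and are not part of any driver.

```python
from urllib.parse import urlencode


def build_uri(w, journal=None, host="localhost:27017"):
    """Build a MongoDB connection string carrying write-concern options.

    `w` and `journal` are connection-string options; the host and this
    helper are assumptions made for the example.
    """
    options = {"w": w}
    if journal is not None:
        options["journal"] = "true" if journal else "false"
    return "mongodb://{}/?{}".format(host, urlencode(options))


fast = build_uri(w=1, journal=False)   # acknowledged, not journaled
safe = build_uri(w=1, journal=True)    # acknowledged after the journal write
fire_and_forget = build_uri(w=0)       # unacknowledged
```

The trade-off is the one in the notes: the first URI favors latency, the second favors durability, and the third does not wait for any reply.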
In a replicated environment, there are many more variables.
Use case 1: Write to the DB. Due to a network error, the system didn’t respond, but the data was written to disk.
Option 1: Depending on the case, retry the write one more time after the failure.
Option 2: In sensitive cases, read the data back to make sure that it was written and act accordingly.
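The two options can be sketched against a fake store. Everything here is hypothetical: `FlakyStore` is an in-memory stand-in that persists the write but loses the first reply, which mimics the use case above. It is not a real driver API.

```python
class FlakyStore:
    """In-memory stand-in for a database whose first reply is lost."""

    def __init__(self):
        self.docs = {}
        self.fail_next_reply = True

    def write(self, key, value):
        self.docs[key] = value              # the write reaches "disk"
        if self.fail_next_reply:
            self.fail_next_reply = False
            raise ConnectionError("reply lost")  # the acknowledgement does not

    def read(self, key):
        return self.docs.get(key)


def write_with_retry(store, key, value, attempts=2):
    """Option 1: on failure, write again. Safe only for idempotent writes."""
    for attempt in range(attempts):
        try:
            store.write(key, value)
            return attempt + 1              # number of attempts used
        except ConnectionError:
            continue
    raise ConnectionError("write not acknowledged")


def write_then_verify(store, key, value):
    """Option 2: on failure, read back before deciding to write again."""
    try:
        store.write(key, value)
        return "acknowledged"
    except ConnectionError:
        if store.read(key) == value:
            return "verified by read"
        store.write(key, value)
        return "rewritten"
```

Option 1 is fine for writes that can be repeated without harm (setting a field to a value); for something like an increment, a blind retry would apply the change twice, which is why Option 2 reads first.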
To elect a primary, we need to have an odd number of servers.
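The reason for the odd number shows up in a little arithmetic: an election needs a strict majority of the voting members, so adding one server to an odd-sized set does not let it survive any more failures. A small sketch (the function names are just for illustration):

```python
def majority(members):
    """Votes needed to elect a primary: strictly more than half."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can be down while a majority is still reachable."""
    return members - majority(members)


for n in range(3, 7):
    print(n, "members: majority", majority(n),
          "tolerates", tolerated_failures(n), "failure(s)")
```

Three and four members both tolerate a single failure, and five and six both tolerate two, so the extra even member adds cost without adding fault tolerance.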
Types of Replica Set Nodes
1. Regular Node – Primary or Secondary
2. Arbiter Node – Only for voting purposes. No Data on it.
3. Delayed Node – It can’t become the primary node. Its updates lag one hour behind the other nodes.
4. Hidden Node – It can’t become primary node. Used for Analytics
All nodes can participate in elections.
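The four node types map onto per-member settings in a replica set configuration. The sketch below models them as plain Python dictionaries. The host names are invented, and the delay field is written as `slaveDelay`, the name used by older MongoDB versions (newer releases call it `secondaryDelaySecs`), so treat the exact field names as an assumption to check against your version.

```python
# Hypothetical five-member replica set, one entry per member.
members = [
    {"_id": 0, "host": "db0.example.net:27017"},                       # regular
    {"_id": 1, "host": "db1.example.net:27017"},                       # regular
    {"_id": 2, "host": "arb.example.net:27017", "arbiterOnly": True},  # arbiter
    {"_id": 3, "host": "late.example.net:27017",                       # delayed
     "priority": 0, "hidden": True, "slaveDelay": 3600},
    {"_id": 4, "host": "stats.example.net:27017",                      # hidden
     "priority": 0, "hidden": True},
]


def can_become_primary(member):
    """Arbiters hold no data and priority-0 members are never elected."""
    if member.get("arbiterOnly", False):
        return False
    return member.get("priority", 1) > 0


def holds_data(member):
    """Every member except an arbiter keeps a copy of the data."""
    return not member.get("arbiterOnly", False)


electable = [m["host"] for m in members if can_become_primary(m)]
```

Here only the two regular nodes end up in `electable`, while all five members still count as voters.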
Writes always go to the primary, and by default reads do too.
An application can choose to read from a secondary.
During the time when failover is occurring, can writes successfully complete?
mongos is the router. It takes care of distribution.
Sharding is used for horizontal scalability
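What a router does can be pictured with a toy range lookup. This is a simplified sketch, not how mongos is implemented: the split points, shard names and the `route` function are all hypothetical, and a real cluster keeps this mapping in its own metadata.

```python
from bisect import bisect_right

# Hypothetical split points on a string shard key:
# keys below "h" -> shard0, "h" up to (not including) "p" -> shard1, rest -> shard2.
SPLIT_POINTS = ["h", "p"]
SHARDS = ["shard0", "shard1", "shard2"]


def route(shard_key):
    """Return the shard whose key range contains shard_key."""
    return SHARDS[bisect_right(SPLIT_POINTS, shard_key)]


for name in ["alice", "mike", "zoe"]:
    print(name, "->", route(name))
```

Scaling horizontally then means adding a shard and a split point, while the application keeps talking to the router only.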
HBase with Java API: https://dzone.com/articles/handling-big-data-hbase-part-4
HBase web site, http://hbase.apache.org/
HBase wiki, http://wiki.apache.org/hadoop/Hbase
HBase Reference Guide http://hbase.apache.org/book/book.html
HBase: The Definitive Guide, http://bit.ly/hbase-definitive-guide
Google Bigtable Paper, http://labs.google.com/papers/bigtable.html
Hadoop web site, http://hadoop.apache.org/
Hadoop: The Definitive Guide, http://bit.ly/hadoop-definitive-guide
Fallacies of Distributed Computing, http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing
HBase lightning talk slides, http://www.slideshare.net/scottleber/hbase-lightningtalk
Sample code, https://github.com/sleberknight/basic-hbase-examples
What is Hive?: Hive is a data warehousing infrastructure based on Hadoop.
What is HBase?: It’s a distributed, versioned, column-oriented NoSQL data store, modeled after Google’s Bigtable. It is used to host very large tables: billions of rows *times* millions of columns.
What is Hadoop?: Hadoop provides massive scale-out and fault tolerance capabilities for data storage and processing on commodity hardware, using the MapReduce programming paradigm.
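The MapReduce paradigm itself is easy to show in miniature. The word count below runs in a single Python process and only illustrates the map, shuffle and reduce steps. It does not use Hadoop, and the function names are chosen for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data", "Big tables"])
```

On a real cluster the map and reduce calls run on many machines in parallel, and the framework handles the shuffle and the retry of failed tasks.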
File Name: calc.py
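# This runs as a service.
from flask import Flask

app = Flask(__name__)


@app.route("/sum/<int:a>/<int:b>/")
def sum(a, b):
    # The int converter in the route already passes a and b as integers.
    total = a + b
    return str(total)


if __name__ == "__main__":
    # Listen on all interfaces so the port is reachable from outside the container.
    app.run(debug=True, host='0.0.0.0')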
File Name: requirements.txt
File Name: Dockerfile
# Use an official Python runtime as a base image. Get this version from local system with >python --version
# Set the working directory to /app
WORKDIR /app
# Copy the current directory contents into the container at /app
ADD . /app
# Install any needed packages specified in requirements.txt. These are packages used in python script.
RUN pip install -r requirements.txt
# Make port 5000 available to the world outside this container
EXPOSE 5000
# Define environment variable
ENV NAME calc
# Run calc.py when the container launches
ENTRYPOINT ["python", "calc.py"]
>docker build -t calc .
>docker run -p 5000:5000 -it calc
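Once the container is running, the service can be called from any HTTP client. A minimal sketch using only the standard library, assuming the container is reachable at `http://localhost:5000` as published by the run command above; `sum_url` and `fetch_sum` are illustrative helper names.

```python
from urllib.request import urlopen


def sum_url(a, b, base="http://localhost:5000"):
    """Build the URL for the /sum/<a>/<b>/ route defined in calc.py."""
    return "{}/sum/{}/{}/".format(base, a, b)


def fetch_sum(a, b, base="http://localhost:5000"):
    """Call the running service and return the result as an int."""
    with urlopen(sum_url(a, b, base)) as response:
        return int(response.read().decode())


if __name__ == "__main__":
    # Requires the calc container to be up; otherwise urlopen raises URLError.
    print(fetch_sum(2, 3))
```

Because the route uses Flask's `int` converter, only non-negative integers match. A request such as `/sum/-1/2/` would get a 404 from this service.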
Note: For real deployments, remove debug=True and run the container in background mode.
1. It is easy to integrate with Docker.
2. Load balancing needs to be taken care of separately.
3. Scaling is easy based on demand.
4. It is easy to deploy/redeploy patches.
All the advantages of microservices.
Flask is a microframework for Python based on Werkzeug, Jinja 2 and good intentions. And before you ask: It’s BSD licensed!
Latest Version: 0.12
In Docker we prefer to run Python code/functionality as a service.
That is where we need Flask.