Application Types or Tiers

Tier 1 application: An information system that is vital to the running of an organization.
Example: Software used to run trains, flights, operating theaters, etc.

Tier 2 application: An outage will cause interruption, but it is not critical.
Example: Ticketing software, classroom software, etc.

Tier 3 application: Non-critical. Business operations will continue without it.
Example: Knowledge search software, homework software, etc.

The classification will change based on the ownership and needs of the business.
Based on the tier, the architecture will be defined for the following:
1. High Availability
2. Scalability
3. Security
etc.
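
As a rough illustration of how a tier can drive architecture decisions, the sketch below maps each tier to a set of targets. The class name, the field names, and every number are hypothetical placeholders, not values taken from any standard; each business would set its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierProfile:
    """Hypothetical architecture targets for one application tier."""
    availability_target: float  # fraction of time the system should be up
    redundant_nodes: int        # minimum number of nodes kept running
    autoscaling: bool           # whether capacity follows the load
    security_review: str        # how often security is reviewed


# All values are illustrative assumptions; each business sets its own.
TIER_PROFILES = {
    1: TierProfile(0.9999, 3, True, "quarterly"),
    2: TierProfile(0.999, 2, True, "yearly"),
    3: TierProfile(0.99, 1, False, "on change"),
}


def allowed_downtime_minutes_per_year(tier: int) -> float:
    """Convert the availability target of a tier into yearly downtime."""
    minutes_per_year = 365 * 24 * 60
    return (1 - TIER_PROFILES[tier].availability_target) * minutes_per_year


for tier in sorted(TIER_PROFILES):
    minutes = allowed_downtime_minutes_per_year(tier)
    print(f"Tier {tier}: about {minutes:.0f} minutes of downtime per year")
```

Expressing the targets as data makes it easy to review them with the business owner when the classification of an application changes.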

-o-

Business Tiers
https://www.quora.com/What-is-the-difference-between-tier-1-tier-2-and-tier-3-business

DSL – Domain Specific Language – for business analysts

How do we provide a tool that lets business analysts write logic in simple English, so that the underlying software can use it without any changes or rewriting?
Answer: A DSL (Domain Specific Language).

If required, we need to develop a DSL from scratch for the given domain, so that BAs can write rules that are easy to integrate with the code.

wiki: https://en.wikipedia.org/wiki/Domain-specific_language
DZone: https://dzone.com/articles/domain-specific-languages-for-business-application

https://tomassetti.me/domain-specific-languages/
(Image: DSL, from https://tomassetti.me/domain-specific-languages/)

How to create your own DSL (Domain Specific Language) in Python


https://www.braintreepayments.com/blog/a-dsl-in-5-languages/
https://dbader.org/blog/writing-a-dsl-with-python
https://enotuniq.org/ – Python as DSL
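
To make the idea concrete, here is a minimal sketch of a rule DSL in Python. The grammar (`when <field> <op> <number> then <action>`), the field names, and the action names are all invented for this example; a real DSL would be designed together with the business analysts of the domain.

```python
import operator
import re

# Comparison operators the toy DSL understands.
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
}

# Hypothetical grammar: when <field> <op> <number> then <action>
RULE_PATTERN = re.compile(
    r"^when\s+(\w+)\s*(>=|<=|==|>|<)\s*(-?\d+(?:\.\d+)?)\s+then\s+(\w+)$"
)


def parse_rule(line):
    """Turn one line of rule text into a (field, compare, value, action) tuple."""
    match = RULE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"Cannot parse rule: {line!r}")
    field, op, value, action = match.groups()
    return field, OPERATORS[op], float(value), action


def evaluate(rules_text, record):
    """Return the actions of every rule whose condition holds for record."""
    actions = []
    for line in rules_text.splitlines():
        if not line.strip() or line.strip().startswith("#"):
            continue  # skip blank lines and comments
        field, compare, value, action = parse_rule(line)
        if field in record and compare(record[field], value):
            actions.append(action)
    return actions


RULES = """
# Rules a business analyst could maintain in a text file
when amount > 1000 then flag_for_review
when age < 18 then reject
"""

print(evaluate(RULES, {"amount": 2500, "age": 30}))
```

The rules live in plain text, so they can change without touching the code that evaluates them; the application only has to map each action name to real behavior.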

—-
Optimization Algorithms
https://developers.google.com/optimization/


DROOLS Expert Page with DSL
https://docs.jboss.org/drools/release/5.2.0.Final/drools-expert-docs/html/ch05.html

Software Architecture Books

UML
http://www.omg.org/ocup-2/index.htm
http://www.omg.org/ocup-2/study-material.htm
OCUP 2 Certification Guide: Preparing for the OMG Certified UML 2.5 Professional 2 Foundation Exam
by Michael Jesse Chonoles
Link: https://www.amazon.com/gp/offer-listing/0128096403

—–
SEI CMM Software Architecture Series Books

Software Architecture in Practice (3rd Edition) (SEI Series in Software Engineering)
by Len Bass et al.
Link: https://www.amazon.com/gp/offer-listing/0321815734

Designing Software Architectures: A Practical Approach (SEI Series in Software Engineering)
by Humberto Cervantes et al.
Link: https://www.amazon.com/gp/offer-listing/0134390784

Evaluating Software Architectures: Methods and Case Studies (SEI Series in Software Engineering)
by Paul Clements
Link: https://www.amazon.com/gp/offer-listing/B012HTRZD6
Note: A senior architect should know how to evaluate, compare, and understand existing projects.

Documenting Software Architectures: Views and Beyond (2nd Edition)
by Paul Clements et al.
Link: https://www.amazon.com/gp/offer-listing/0321552687

ATAM: Architecture Tradeoff Analysis Method
CBAM: Cost-Benefit Analysis Method

—–
TOGAF framework is very useful.

TOGAF Version 9.1
by Van Haren Publishing
Link: https://www.amazon.com/gp/offer-listing/9087536798
Note: Certification helps a lot

Catalog: TOGAF-V9-Sample-Catalogs-Matrics-Diagrams-v2.pdf

—–
Patterns: These are important to know and easy to re-use

Patterns of Enterprise Application Architecture
by Martin Fowler
Link: https://www.amazon.com/gp/offer-listing/0321127420

Security Patterns in Practice: Designing Secure Architectures Using Software Patterns
by Eduardo Fernandez-Buglioni
Link: https://www.amazon.com/gp/offer-listing/1119998948

Head First Design Patterns: A Brain-Friendly Guide
by Eric Freeman et al.
Link: https://www.amazon.com/gp/offer-listing/0596007124
Note: Look for the latest edition.

Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions
by Gregor Hohpe et al.
Link: https://www.amazon.com/gp/offer-listing/0321200683

Domain-Driven Design: Tackling Complexity in the Heart of Software
by Eric Evans
Link: https://www.amazon.com/gp/offer-listing/0321125215
Note: This is very important when working with specific domains such as Finance, Media, Auto, Insurance, etc.

Service Design Patterns: Fundamental Design Solutions for SOAP/WSDL and RESTful Web Services
by Robert Daigneau
Link: https://www.amazon.com/gp/offer-listing/032154420X

——-

Data

NoSQL and SQL Data Modeling: Bringing Together Data, Semantics, and Software
by Ted Hills
Link: https://www.amazon.com/gp/offer-listing/1634621093

Database Design Using Entity-Relationship Diagrams, Second Edition (Foundations of Database Design)
by Sikha Bagui et al.
Link: https://www.amazon.com/gp/offer-listing/1439861765
Note: A similar book on the topic also works.

The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling
by Ralph Kimball et al.
Link: https://www.amazon.com/gp/offer-listing/1118530802

———-

Software Frameworks

What Is a Framework?

You should have working experience with at least one in each category:
RDBMS (Oracle, MS SQL, Postgres, MySQL, etc.)
NoSQL (MongoDB, MarkLogic, etc.)
Service (Java Services, Spring, Python Flask, etc.)
UI (NodeJS, React, Angular, HTML, JS, CSS)
Reporting (Jasper, Tableau)
Data Warehousing Fundamentals
ETL (Informatica, Apache NiFi, etc.)
OS (Linux Red Hat, Ubuntu, etc.)
Messaging (JMS, Kafka, etc.)
Cloud (AWS, Cloudera, IBM, MS Azure, etc.)
SEI CMM Process

Optional
BigData (Hadoop, etc.)
—-
Note: If you are coming from a non-software background, please do an MS or BS in Computer Science,
or get the textbooks from a BS course and study them in your free time.

Bible for Software Engineers
Software Engineering: A Practitioner’s Approach
by Roger S. Pressman et al.
Link: https://www.amazon.com/gp/offer-listing/0078022126

—-
Testing

Foundations of Software Testing ISTQB Certification
by Rex Black et al.
Link: https://www.amazon.com/gp/offer-listing/1408044056

Learning Selenium Testing Tools – Third Edition
by Raghavendra Prasad MG
Link: https://www.amazon.com/gp/offer-listing/1784396494
Note: Check different books on Selenium.
—–
Project Management

PMP Exam Prep, Eighth Edition – Updated: Rita’s Course in a Book for Passing the PMP Exam
by Rita Mulcahy
Link: https://www.amazon.com/gp/offer-listing/1932735658

Essential Scrum: A Practical Guide to the Most Popular Agile Process (Addison-Wesley Signature Series (Cohn))
by Kenneth S. Rubin
Link: https://www.amazon.com/gp/offer-listing/0137043295

—-

Requirements / UX

Lean UX: Designing Great Products with Agile Teams
by Jeff Gothelf et al.
Link: https://www.amazon.com/gp/offer-listing/1491953608

Software Requirements (3rd Edition) (Developer Best Practices)
by Karl Wiegers et al.
Link: https://www.amazon.com/gp/offer-listing/0735679665

Serverless Architectures

https://martinfowler.com/articles/serverless.html

Backend as a Service or “BaaS”
Function as a Service or “FaaS” (Example: http://openwhisk.incubator.apache.org/)
AWS Lambda: https://aws.amazon.com/lambda/
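
For a feel of what a FaaS unit of deployment looks like, below is a minimal sketch of a Python function written in the handler style that AWS Lambda uses, where the platform calls a function with an `event` and a `context`. The `"name"` field in the event and the greeting are assumptions made up for this example.

```python
import json


def lambda_handler(event, context):
    """Entry point in the style AWS Lambda uses for Python functions.

    `event` carries the request data and `context` carries runtime details.
    The "name" key is a hypothetical field chosen for this example.
    """
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Local call for experimentation; no cloud account is needed for this.
print(lambda_handler({"name": "architect"}, None))
```

There is no server, port, or process management in the code; the platform owns all of that, which is the main attraction of the model.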

Advantages:
1. Easy to develop and deploy lightweight systems.
2. Good for systems with low CPU consumption and low usage; it saves money on infrastructure.

Disadvantages:
1. Enterprise-scale, high-throughput applications end up paying more money to IBM, Amazon, etc.
2. We need to pay for each task:
   1. CPU
   2. Data storage
   3. Disk space
   4. Total number of calls
In this respect it becomes almost like mainframe systems.
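
The trade-off between the advantages and disadvantages can be sketched with a small cost model. Every price below, including the fixed server bill, is a hypothetical number chosen only to show the break-even effect; real provider pricing differs and changes over time. Storage charges are left out to keep the sketch short.

```python
def pay_per_use_monthly_cost(calls, seconds_per_call, memory_gb,
                             price_per_million_calls, price_per_gb_second):
    """Estimate a monthly bill from the number of calls and compute time."""
    request_cost = calls / 1_000_000 * price_per_million_calls
    compute_cost = calls * seconds_per_call * memory_gb * price_per_gb_second
    return request_cost + compute_cost


# Hypothetical prices and a hypothetical fixed server bill, for illustration.
PRICE_PER_MILLION_CALLS = 0.20
PRICE_PER_GB_SECOND = 0.00002
FIXED_SERVER_COST = 150.0

for calls in (1_000_000, 100_000_000):
    cost = pay_per_use_monthly_cost(calls, 0.2, 0.5,
                                    PRICE_PER_MILLION_CALLS,
                                    PRICE_PER_GB_SECOND)
    cheaper = "pay-per-use" if cost < FIXED_SERVER_COST else "fixed servers"
    print(f"{calls:>11,} calls/month: {cost:8.2f} -> {cheaper}")
```

With these assumed numbers the low-usage system is far cheaper on pay-per-use, while the high-throughput one crosses the fixed server cost, which matches the advantage and disadvantage lists above.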

-o-

Software Architect Catalogue

IBM Cloud Catalog
https://console.bluemix.net/catalog/

Amazon Catalog
https://us-west-2.console.aws.amazon.com/console/home?region=us-west-2#
https://aws.amazon.com/documentation/

Architectural Patterns
https://en.wikipedia.org/wiki/Architectural_pattern

Software Security Patterns
https://en.wikipedia.org/wiki/Attack_patterns
https://en.wikipedia.org/wiki/Security_pattern
https://www.owasp.org/index.php/Security_by_Design_Principles
https://dzone.com/articles/9-software-security-design

Integration Patterns
http://www.enterpriseintegrationpatterns.com/patterns/messaging/
http://camel.apache.org/enterprise-integration-patterns.html

Software Architecture
SEI CMM
Oracle
TOGAF

-o-

What Does a Software Architect Do?

Negotiate for timelines and resources
Communicate between project stakeholders
Evangelize best practices
Identify and mitigate risks
Be strong in technology/software fundamentals (not just white-paper knowledge)
Be proficient in Agile methodologies

Software Selection and Evaluation

Quantitative Methods for Software Selection and Evaluation
ftp://ftp.cert.org/public/documents/06.reports/pdf/06tn026.pdf

A Process for COTS (commercial off-the-shelf) Software Product Evaluation: 03tr017.pdf

Text Processing

Text Processing Architecture

OpenText Search Server
http://www.opentext.com/what-we-do/industries/legal/legal-content-management-edocs/opentext-search-server-edocs-edition

Noggle

Cognitive Search Engine: How To Overcome The Knowledge Disconnect

http://blogs.forrester.com/mike_gualtieri/17-06-12-cognitive_search_is_the_ai_version_of_enterprise_search
Cognitive Search Is The AI Version Of Enterprise Search

https://www.elastic.co/guide/en/elasticsearch/guide/current/index.html

MicroServices

Microservices are a quick way to serve UI needs.

Microservices trends 2017: Strategies, tools and frameworks


Micro Services Comparison

Python and Flask
https://stackoverflow.com/questions/10938360/how-many-concurrent-requests-does-a-single-flask-process-receive
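
Flask applications are WSGI applications, so the shape of a small Python service can be sketched with the standard library alone. The block below is not Flask code; `health_service`, the `/health` path, and the helper `call` are names made up for this example.

```python
import json


def health_service(environ, start_response):
    """A tiny WSGI application exposing one JSON endpoint, /health."""
    if environ.get("PATH_INFO") == "/health":
        status, payload = "200 OK", {"status": "up"}
    else:
        status, payload = "404 Not Found", {"error": "unknown path"}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call(app, path):
    """Invoke a WSGI app directly, without a network socket (test helper)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], json.loads(body)


print(call(health_service, "/health"))
```

The same callable can be served locally with `wsgiref.simple_server.make_server`. How many requests it handles concurrently is decided by the server that hosts it (its threads and processes), not by the application function itself.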

Micro Services – Performance Comparison
https://cdelmas.github.io/2016/06/20/Performance-of-Microservices-frameworks.html

References:
http://microservices.io/
https://apigee.com/about/blog/cto-musings/api-best-practices-microservices
https://www.mulesoft.com/webinars/api/microservices-architecture

Address the following when choosing microservices:

Domain Driven Design

Performance
Security
Concurrency
Availability of Engineers
Easy to install/maintain/monitor (Dev Ops)
Easy to develop (Developers)
Session handling
Testing
Debugging
Logging

Commercial Support when needed
Future of Project
License
Support in Amazon AWS and Microsoft Azure
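
One simple way to turn a checklist like this into a comparison is a weighted scoring matrix. The sketch below uses a few of the criteria listed above; the weights, the scores, and the candidate names "Framework A" and "Framework B" are all made up for illustration.

```python
# Weights are hypothetical; they only need to reflect relative importance.
WEIGHTS = {
    "performance": 5,
    "security": 5,
    "availability_of_engineers": 4,
    "ease_of_operations": 3,
    "commercial_support": 2,
    "license": 2,
}

# Scores from 1 (poor) to 5 (excellent); the numbers are made up.
CANDIDATES = {
    "Framework A": {"performance": 3, "security": 4,
                    "availability_of_engineers": 5, "ease_of_operations": 4,
                    "commercial_support": 2, "license": 5},
    "Framework B": {"performance": 5, "security": 4,
                    "availability_of_engineers": 2, "ease_of_operations": 3,
                    "commercial_support": 4, "license": 3},
}


def weighted_score(scores, weights):
    """Weighted average of the criterion scores, on the same 1 to 5 scale."""
    total_weight = sum(weights.values())
    weighted_sum = sum(scores[name] * weight for name, weight in weights.items())
    return weighted_sum / total_weight


def rank(candidates, weights):
    """Return (name, score) pairs sorted from best to worst."""
    scored = {name: weighted_score(s, weights) for name, s in candidates.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


for name, score in rank(CANDIDATES, WEIGHTS):
    print(f"{name}: {score:.2f}")
```

The value of the exercise is less in the final number than in forcing the team to agree on the weights before looking at the candidates.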

Moving data from system A to system B

This is an age-old problem that has to be solved in the majority of projects.

History: It comes under flow-based programming: https://en.wikipedia.org/wiki/Flow-based_programming

Scope:
Our focus is to move data from system A to system B: only extraction and loading, with little attention to transformations.

———————————
Option 0: Hand coding in Python / Java / Perl, etc.
This is good for small sets of data and for a POC.
Not recommended for production without failover, job management, job scheduling, etc.
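
A minimal sketch of what such hand coding can look like in Python is shown below. Two in-memory SQLite databases stand in for system A and system B; the `customers` table, its columns, and the function name `copy_table` are assumptions made up for this example.

```python
import sqlite3


def copy_table(source, target, table, columns, batch_size=1000):
    """Copy rows from source to target in batches; return the row count.

    `table` and `columns` are trusted identifiers supplied by the developer,
    not user input, because they are interpolated into the SQL text.
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    reader = source.execute(f"SELECT {column_list} FROM {table}")
    copied = 0
    while True:
        rows = reader.fetchmany(batch_size)
        if not rows:
            break
        with target:  # commit each batch, roll back if the insert fails
            target.executemany(
                f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})",
                rows,
            )
        copied += len(rows)
    return copied


# Two in-memory databases stand in for system A and system B.
system_a = sqlite3.connect(":memory:")
system_b = sqlite3.connect(":memory:")
for db in (system_a, system_b):
    db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
system_a.executemany("INSERT INTO customers VALUES (?, ?)",
                     [(i, f"customer-{i}") for i in range(2500)])
system_a.commit()

print(copy_table(system_a, system_b, "customers", ["id", "name"]))
```

The sketch only extracts and loads. It has no retry, no restart from the last committed batch, and no scheduling, which are the gaps that make hand coding a poor fit for production.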

———————————

Option 1: If the system is heavy and needs a robust solution, it is better to go with Apache NiFi
https://nifi.apache.org/

The US National Security Agency open-sourced its NiagaraFiles, or NiFi, data-flow software.
https://en.wikipedia.org/wiki/Apache_NiFi

How to enable security for NiFi?
http://ijokarumawak.github.io/nifi/2016/11/15/nifi-auth/

How to write Java code for NiFi and other languages?
https://community.hortonworks.com/questions/75977/run-java-code-in-apache-nifi.html

Examples of directories with a date suffix:
https://community.hortonworks.com/questions/44215/is-there-a-processor-in-nifi-that-can-create-many.html

Commercial support available:
https://hortonworks.com/apache/nifi/

Versioning available:
https://community.hortonworks.com/questions/61475/nifi-workflow-version-control-deployment.html

Externalizing variables is possible.
It is easy to move configurations from QA to Prod.

We can slim down the system to minimize its footprint:
https://community.hortonworks.com/articles/32605/running-nifi-on-raspberry-pi-best-practices.html

NiFi supports Hadoop HDFS:
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.hadoop.PutHDFS/

Alternatives:
http://storm.apache.org/index.html
But Storm's objective is different.

———————————

Option 2: Use streaming API of Apache Spark
http://spark.apache.org/docs/latest/streaming-programming-guide.html
Sqoop Vs Flume
https://www.dezyre.com/article/sqoop-vs-flume-battle-of-the-hadoop-etl-tools-/176

———————————

Option 3: If you are using CDAP, it is better to use Hydrator to generate the JSON and use it.
A bit more study is required around metrics, management, and tracking of these jobs.
http://docs.cask.co/cdap/4.1.0/en/developers-manual/pipelines/developing-pipelines.html

https://github.com/cdap-guides/cdap-etl-guide
http://blog.cask.co/2016/06/bringing-relational-data-into-data-lakes/

It is better to stay away from the CDAP stack. There is not much public adoption, and questions on their forums go unanswered. If we call them, they ask us to buy their support or consulting hours. There is nothing wrong with this, but we can't afford it.
http://cask.co/support/

Their poor support can be seen in their user group:
https://groups.google.com/forum/#!forum/cdap-user

———————————

Option 4: Pentaho Kettle
http://wiki.pentaho.com/display/BAD/Kettle+on+Spark
It is not ready for Big Data as of March 2017.
Good for small Java enterprise projects (coding is required with the Kettle API). Used in the past.
http://javadoc.pentaho.com/kettle/ – Java documentation quality is not good.
https://community.hortonworks.com/questions/24014/what-is-the-difference-between-nifi-and-kettle.html

———————————

Option 5: Commercial products

https://www.talend.com/
http://www.robertomarchetto.com/talend_studio_vs_kettle_pentao_pdi_comparison

http://www.alteryx.com/ is a good product, and it has better support for https://www.tableau.com/ (BI/Analytics)

———————————

Option 6: Spring Batch

If we want to minimize the number of servers and want a minimal solution, Spring Batch is a good one.
But it needs continuous maintenance whenever the Spring or Java version changes.

Spring Integration: http://docs.spring.io/spring-integration/reference/html/ftp.html
Spring batch partitioning: https://keyholesoftware.com/2013/12/09/spring-batch-partitioning/
Spring Batch Reference: http://docs.spring.io/spring-batch/reference/html/index.html
Spring Batch UI: http://docs.spring.io/spring-batch-admin/reference/reference.xhtml

———————————

Conclusion:
Use Apache NiFi as much as possible. It works well in production and is also quick for POCs.

As of March 11, 2017: https://groups.google.com/forum/#!topic/cdap-user/hiuUP3jIxNs
CDAP Hydrator is not in a position to compete with Apache NiFi.
-0-
