Software Architect and Software Architecture

What is Software Architecture?
Software Architecture is the structure of a software system, its fundamental properties, and the principles of its design and evolution.
Structure: It is about identifying the architecturally significant pieces of a system and defining the relationships between them.
Properties: It is about the functional properties of a system and its quality properties like performance, scalability, security, etc.
Principles: It is about understanding the principles behind its design, an understanding that allows the system to evolve in a consistent and logical way without introducing unnecessary complexity.
What is the importance of Software Architecture?
The need for Software Architecture is best summarized in the following statement: "Software architecture represents a system's earliest set of design decisions. These early decisions are the most difficult to get correct and the hardest to change later in the development

REST - Representational State Transfer

REST Architecture Style: Representational State Transfer
Year 1996: Berners-Lee writes that the "Web's major goal was to be a shared information space through which people and machines could communicate."
Year 2000: Hypermedia was chosen as the user interface because of its simplicity and generality. Hypermedia, an extension of the term hypertext, is a non-linear medium of information that includes graphics, audio, video, plain text, and hyperlinks. A non-linear medium is any medium that can be navigated through random access.
As the Internet grew rapidly, the architecture that had been deployed showed significant limitations in its support for extensibility, shared caching, and intermediaries, which made it difficult to develop ad-hoc solutions to the growing problems. The challenge was to introduce a new set of functionality to an architecture that was already widely deployed, and how to ensure that its introduction does not adversely impact,

Software Architecture

This post is an abstract of some chapters in the book 'Software Systems Architecture' by Eoin Woods and Nick Rozanski.
Software Architecture Definition
Software elements that you need to specify and/or design in order to meet a particular set of requirements, plus the hardware required to run those software elements on.
Key Parts of the definition
Structure - System’s elements, pieces that can be constructed, and their relationships
  Static structure: Software classes, Relational entities, Network, Hardware etc
  Dynamic structure: System response to an external stimulus; Information flow, parallel/serial execution of tasks, effects on data (create, update, delete)
Properties - Fundamental properties of a system
  Externally visible properties: Functional behavior
  Quality properties: Scalability, Performance, Security etc
Principles - of its design and evolution
  Fundamental beliefs, approach or intent - that guides the architecture
  Conventions that

Some simple questions that may need some thinking

In this blog post, I have put down some questions that may seem random but are important whenever you develop software. The answers to these questions depend on the context and require a lot of experience and knowledge to make a good judgment.
Where should I store media images for my web application?
  Inside the web application
  On a web server
  In a database
  In the cloud
Where should I write application log messages?
  In a local file
  Syslog
  RDBMS
  NoSQL database
Where should a client store an authentication token?
  Cookies
  Local Storage
What kind of authentication mechanism should I use for my web application?
  Stateless session tokens
  Session IDs
What should be the format of log file messages?
  Free text string format
  Key-Value Pairs string format
  JSON
What should I use for notifying a service for some action?
  A message queue
  A table in a shared database
Can I use a single load balancer to handle the load of hundred
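To make the log format question concrete, here is a minimal sketch that renders one event in each of the three formats listed above. The function names, the field names (`user_id`, `ip`), and the exact layout of each format are assumptions chosen for illustration, not a standard.

```python
import json


def as_free_text(level, message, fields):
    # Free text: easy for humans to read, harder for tools to parse reliably.
    extras = " ".join(f"{key} {value}" for key, value in fields.items())
    return f"{level}: {message} ({extras})"


def as_key_value(level, message, fields):
    # Key-value pairs: one key="value" token per field.
    pairs = {"level": level, "msg": message, **fields}
    return " ".join(f'{key}="{value}"' for key, value in pairs.items())


def as_json(level, message, fields):
    # JSON: one object per line, parseable by any JSON library.
    return json.dumps({"level": level, "msg": message, **fields}, sort_keys=True)


# Hypothetical event used only for this example.
event = ("INFO", "user logged in", {"user_id": 42, "ip": "10.0.0.7"})
print(as_free_text(*event))
print(as_key_value(*event))
print(as_json(*event))
```

The trade-off is roughly readability against machine parseability: the JSON line keeps the type of `user_id` as a number, while the other two formats turn every value into text.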

Big Data, Streaming Data - ETL Analytics Pipeline


Apache Hadoop Ecosystem

Hadoop HDFS - 2007 - A distributed file system for reliably storing huge amounts of unstructured, semi-structured and structured data in the form of files.
Hadoop MapReduce - 2007 - A distributed algorithm framework for the parallel processing of large datasets on the HDFS filesystem. It runs on a Hadoop cluster but also supports other databases like Cassandra and HBase.
Cassandra - 2008 - A key-value pair NoSQL database, with column family data representation and asynchronous masterless replication.
HBase - 2008 - A key-value pair NoSQL database, with column family data representation and master-slave replication. It uses HDFS as underlying storage.
ZooKeeper - 2008 - A distributed coordination service for distributed applications. It is based on a Paxos algorithm variant called Zab.
Pig - 2009 - Pig is a scripting interface over MapReduce for developers who prefer a scripting interface over native Java MapReduce programming.
Hive - 2009 - Hive is a SQL interf
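The MapReduce entry above describes a programming model that can be sketched in a few lines. The following is a single-process illustration of the map, shuffle, and reduce steps for a word count; it does not use the Hadoop API, and all function names are hypothetical. A real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework would
    # do between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: combine all values for one key into a single result.
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


print(word_count(["big data", "streaming data"]))
```

Because each call to `map_phase` depends on only one line and each call to `reduce_phase` on only one key, both steps can be distributed without coordination, which is the property the framework relies on.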

Big Data After The Internet

Until 1995, most people did not know about the Internet. It was hard to use until the Netscape browser arrived and its famous IPO happened. The arrival of Netscape meant anyone could create material and anyone with a connection could view it. The Internet's popularity resulted in a mushrooming of websites such as AOL, MSN, Yahoo, CNN, Napster and many more. They provided free information sharing services like email, chat, photograph sharing, video sharing, blogging, news, weather, music, games etc. These sites were generating, collecting and sharing an enormous amount of data for people all over the globe. There were, of course, new generation e-commerce companies like Amazon and eBay that also contributed to the overall information available, but sharing of information was not at the core of their strategy.
Why is this phenomenon of information sharing noteworthy? There are two good reasons: The data on the Internet was freely available to everyone on the Interne