Saturday, November 24, 2007
I have imported all my previous posts into my new blog. I found WordPress easy to work with, yet flexible and powerful enough to customize the way I want. I am using Blueprint as my theme and find it very elegant. It's built on the Blueprint CSS framework. If anybody is interested you can download it from here.
Many thanks to the Blogger team for improving Blogger over the last few months. I thoroughly enjoyed my blogging experience here. I like to be in control of my everyday life, including blogging. Hence my move to my own thing.
Friday, November 16, 2007
- You need to understand and take complete control of your architecture. Read my post on Architecture is your responsibility for more thoughts on this.
- You need to have scalability in mind right from the beginning. Trying to achieve scalability later can be time consuming and very costly. Quoting from Werner Vogels' post
- Transactional scaling is just one dimension. You need to think about scalability of data, operations, deployment, power, etc. This is a minimal set. Try to figure out which dimensions are important to your organization.
- All scalability dimensions are related and impact each other. Any dimension you ignore can evolve into a problem for your application.
- Prefer horizontal over vertical scaling. Vertical scaling is good for your vendors but is not a viable long-term strategy. There is only so much you can gain by adding memory, CPU, etc.
Why is scalability so hard? Because scalability cannot be an after-thought. It requires applications and platforms to be designed with scaling in mind, such that adding resources actually results in improving the performance or that if redundancy is introduced the system performance is not adversely affected. Many algorithms that perform reasonably well under low load and small datasets can explode in cost if either requests rates increase, the dataset grows or the number of nodes in the distributed system increases.
Transactional scalability is usually measured in TPS (transactions per second) and is a traditional indicator of application performance.
- Keep asking the question "How long can the business survive?" based on,
- Time-to-live on current resources.
- Time-to-live on maximum plausible configuration.
- These metrics should be taken regularly to anticipate possible production bottlenecks and identify issues before they become a crisis.
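The "time-to-live" question above can be turned into a rough number. Here is a minimal sketch, assuming load grows at a steady compound monthly rate; the function name, the parameters and the growth model are my own assumptions for illustration, not something from the talk.

```python
import math


def months_until_exhausted(current_tps: float, capacity_tps: float,
                           monthly_growth: float) -> float:
    """Months until load reaches capacity, assuming compound monthly growth.

    monthly_growth is a fraction, e.g. 0.10 for 10% growth per month.
    """
    if current_tps <= 0 or capacity_tps <= 0:
        raise ValueError("TPS values must be positive")
    if current_tps >= capacity_tps:
        return 0.0  # already at or over capacity
    if monthly_growth <= 0:
        return math.inf  # load is flat or shrinking, capacity is never reached
    # Solve current * (1 + g)^n = capacity for n.
    return math.log(capacity_tps / current_tps) / math.log(1.0 + monthly_growth)


if __name__ == "__main__":
    # Hypothetical numbers: measure your own.
    print("current resources:", months_until_exhausted(400, 1000, 0.10))
    print("max plausible configuration:", months_until_exhausted(400, 5000, 0.10))
```

Running it once with the capacity of your current resources and once with the capacity of the maximum plausible configuration gives the two time-to-live figures to track over time.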
How well does your data scale? Think about,
- Functional decomposition: group data by logical relationships, business importance, transactional volumes, etc.
- Think about partitioning data (sharding).
- Is all data equally important? Prioritizing your data and allocating resources accordingly will help you scale better.
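To make the sharding point concrete, here is a minimal sketch of picking a partition from a record key. It assumes a simple hash-modulo scheme; `shard_for` and the key format are hypothetical names for illustration.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a record key to a shard index in the range [0, shard_count)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # Use a stable hash so every process agrees on the mapping.
    # (Python's built-in hash() is randomized per process for strings.)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


if __name__ == "__main__":
    for customer in ("customer-1", "customer-2", "customer-3"):
        print(customer, "-> shard", shard_for(customer, 4))
```

Note that with a plain modulo, changing the number of shards remaps most keys. Schemes such as consistent hashing or a lookup directory reduce that cost, at the price of more moving parts.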
How hard is it to run your software? Operational scalability is a software problem and you need to think about operational concerns right from the beginning. Pay attention to,
- Logging metrics, Monitoring.
- Controlling/updating/tuning live apps without disrupting traffic.
You need to design/architect your systems while keeping the following in mind,
- Ability to do incremental roll outs (and rollback if there are problems) without disrupting live traffic.
- Managing component dependencies during deployment without disrupting live traffic.
- Your architecture shouldn't assume, or couple itself to, any particular hardware, network topology or data center topology. This allows you to take advantage of new hardware, network topologies, etc. without significant changes.
Power can be a limiting factor in a data center and may put bounds on transactional scaling.
- How efficient is your software? Wasted clock cycles == wasted watts.
- Consider virtualization for best utilization of your hardware resources.
Some good tips I managed to note down
- Run the old and new schemas in parallel, then take out the old schema after a while when you have gained enough confidence.
- Prioritize services, and be willing to take a hit on certain services over others.
- Incremental rollouts are a very good way to roll out new features while managing risk, and they also avoid taking the system offline.
- Schedule deployments during working hours instead of weekends/nights, as this enables your developers and support staff to attend to problems while they are alert and not distracted by non-work issues. If you do incremental rollouts this is possible, as you are not disrupting traffic.
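One common way to do an incremental rollout is to expose a feature to a fixed percentage of users and raise that percentage as confidence grows. The sketch below is only an illustration under that assumption; `in_rollout`, the feature name and the bucketing scheme are hypothetical and were not part of the talk.

```python
import hashlib


def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Return True if this user falls inside the first `percent` of buckets."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Hash feature and user together so each feature gets its own stable split.
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


if __name__ == "__main__":
    users = [f"user-{n}" for n in range(1000)]
    for percent in (1, 10, 50, 100):
        enabled = sum(in_rollout(u, "new-checkout", percent) for u in users)
        print(f"{percent}% target -> {enabled} of {len(users)} users enabled")
```

Because a user's bucket never changes, anyone enabled at 10% stays enabled at 50%, and rolling back is just lowering the percentage.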
Wednesday, November 14, 2007
It’s (Roy's dissertation) not really primarily about REST; rather, it’s about principled design. Much of his dissertation is about architectural elements, principles, constraints, properties, and the relationships between them all. REST is used as a very clear example in chapter 5 of what principled design is all about. Why can't we use a principled design approach when we do SOA or, for that matter, any other architecture?
When we add or relax constraints we induce certain properties in our architecture. As an architect you make an educated decision as to which constraints make sense in your environment and which don't. When designing systems, don't we go through decisions like "should we make these services stateless or stateful?" during our design meetings?
I think that in whatever system you design, as an architect you should think through and note down the constraints you want to impose on your system. This will provide a proper foundation for your system and an excellent guideline for your developers, which will clearly communicate the desired goals of your system. Later on, when somebody else wants to relax any of these constraints or add more constraints, they already have a guideline and can see how the "relaxing of an existing constraint" or the "addition of a new constraint" can impact the overall system.
REST is just a name coined by Roy to identify a set of constraints, and they are not the only constraints, nor the best combination of constraints in every situation. As Steve mentioned Roy spends the first few chapters providing an excellent analysis about "architectural elements, principles, constraints, properties, and the relationships between them all" and of course the value of a principled design approach.
To me the value of Roy's thesis goes beyond REST and I hope most people would realize the same.
Thursday, November 01, 2007
- One volcanic eruption can contribute to global warming more than what humans can do in a year.
- The earth was actually warmer than it is now.
All I know is that humans haven't really figured it out yet. We think we have and try to mess around with nature, but I don't think we are even close to figuring out the real situation.
Monday, October 29, 2007
Dan's comments on architecture were very insightful (I want to write a separate post on what I learned from his talk at the recently concluded Colorado Software Summit). The underlying truth of everything he said was that they understood and took responsibility for the architectural decisions they made, instead of relying on some vendor to provide direction and overall vision.
There is no vendor out there that can provide you with some ESB that magically transforms your enterprise into a SOA platform, or some messaging middleware that scales your enterprise to whatever limits you want, unless you know what you are doing and take ownership of the overall vision. You need to understand the overall architecture, make decisions and take responsibility for them. An ESB or a messaging middleware is merely a tool that helps you get there. In other words, they are just a means to an end, not the end itself.
There is no framework out there that can force architectural decisions on your solutions that you are not willing to make yourself. During my REST in peace talk, there was a surprising number of folks who asked me about a framework that could help them develop RESTful services. Guess what: the road to a RESTful approach (or for that matter any architectural style) starts with the architectural decisions you make (the way you think about and design your services) and not with some framework where you flip a switch or use a bunch of annotations that turn your code into a RESTful service. That is precisely why the contract first approach is recommended over a code first approach when you do web services. You need to think about how you design your service first and then use some framework to generate your WSDL and your code from that, not the other way around.
We all remember how the EJB mania deceived us. Many companies paid millions of dollars to App Server vendors to solve their architectural problems. The whole notion of "you only need to think/write the business logic, and we will take care of the remoting, transactions, persistence, scalability, etc." was just an illusion. It neither prevented people from making extremely stupid architectural decisions nor provided any more scalability than the simple Tomcat web server for most use cases.
You need to think carefully about the architectural decisions you make and understand the impact they have on the overall goals/vision of your enterprise. You need to be aware of operational, load, managerial and geographical scalability from day one. You cannot offset your lack of architectural vision by using some framework, product or vendor. It will only make your vendor happy, but not your customers.
Competition provides choice and facilitates continuous growth and innovation in open source solutions. It drives a community to be more responsive and responsible towards its end users. This results in better support in the form of fixing bugs or answering questions on the list, because if you are not growing or innovating, or if you are not responsive or responsible towards the end users, they will look elsewhere. One could argue that there are companies that provide support. However, one should not forget that these companies are built on top of the community and rely heavily on the community for their success. Any fixes/patches they make usually go upstream. Companies that don't contribute back usually run into problems and fade away.
Sometimes you will find that some community members are unhappy with the current direction of a project and go ahead and form another project. The difference in direction or focus is perhaps an integral part of the evolutionary process. Some of these projects eventually have a company formed behind them. One could also argue that these companies fragment a community and promote competition. As long as this competition is both ethical and within the norms of standard industry practice, the end users benefit from it. Why? Because these companies will drive innovation, creativity and quality in the solutions they support, as their business model is based on it.
Therefore some form of competition that is ethical (not mud slinging or cut-throat competition) is healthy for making open source a viable option in enterprise software. The process of evolution will weed out inferior solutions and ensure the survival of the fittest. However, this should not be based on how much marketing muscle a project or the company behind it has, but rather on the community aspect and technical merits.
Saturday, October 13, 2007
It is important to note that with any exchange type, a message can be matched with more than one queue if two or more queues are bound with the same routing criteria.
The direct exchange does a direct match between the routing key provided in the message and the routing criteria used when a queue is bound to this exchange.
The most common use case is to bind the queue to the exchange using the queue name. However it is important to note that you could use any value for the binding.
A broker is required to provide an instance of this exchange named "amq.direct". The Nameless Exchange is a special instance of the above exchange type where all queues are bound to this exchange automatically using the queue name as the routing criteria. This exchange instance has no public name, hence messages sent without specifying an exchange name are directed to this exchange.
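The direct matching rule can be sketched in a few lines. This is only an in-memory illustration of the matching described above, not a broker or client API; the class and method names are my own.

```python
from collections import defaultdict


class DirectExchange:
    """Toy model of direct-exchange routing (exact match on the routing key)."""

    def __init__(self):
        # binding key -> names of the queues bound with that key
        self._bindings = defaultdict(list)

    def bind(self, queue_name, binding_key=None):
        # The common case binds a queue under its own name,
        # but any value may be used as the binding key.
        key = queue_name if binding_key is None else binding_key
        if queue_name not in self._bindings[key]:
            self._bindings[key].append(queue_name)

    def route(self, routing_key):
        """Return the queues whose binding key equals the routing key."""
        return list(self._bindings.get(routing_key, []))


if __name__ == "__main__":
    exchange = DirectExchange()
    exchange.bind("orders")            # bound using the queue name
    exchange.bind("audit", "orders")   # a second queue, same routing criteria
    print(exchange.route("orders"))    # both queues match
    print(exchange.route("unknown"))   # nothing matches
```

Binding every queue under its own name, as the first `bind` call does, is what the nameless exchange does automatically.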
The topic exchange does a wildcard match between the routing key and the routing pattern specified in the binding. The routing key is treated as zero or more words, delimited by '.', and the pattern supports special wildcard characters: '*' matches a single word and '#' matches zero or more words.
A broker is required to provide an instance of this exchange named "amq.topic".
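Here is a minimal sketch of the wildcard matching described above, written as a plain function. It follows the rules as stated in this post ('*' is exactly one word, '#' is zero or more words); the function name is my own, and a real broker would use something more efficient than this recursive version.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if a '.'-delimited routing key matches a binding pattern."""
    p = pattern.split(".") if pattern else []
    k = routing_key.split(".") if routing_key else []

    def match(i: int, j: int) -> bool:
        if i == len(p):
            # Pattern exhausted: match only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j == len(k):
            return False  # pattern still needs a word, key has none left
        if p[i] == "*" or p[i] == k[j]:
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


if __name__ == "__main__":
    for pattern, key in [("stock.*.nyse", "stock.ibm.nyse"),
                         ("stock.*", "stock.ibm.nyse"),
                         ("stock.#", "stock"),
                         ("stock.#", "stock.ibm.nyse")]:
        print(pattern, key, topic_matches(pattern, key))
```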
Queues are bound to the fanout exchange with no arguments. Hence any message sent to this exchange will be forwarded to all queues bound to this exchange.
- One use case is to use exchange chaining in a tree-like hierarchy that can be used to push messages to a large number of subscribers.
- Another use case is where a direct exchange or a topic exchange does the initial filtering and then forwards the message to a fanout exchange, which pushes the message to all its queues.
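The fanout behaviour is the simplest of the lot. Below is a tiny in-memory sketch, with queues modelled as plain lists; the names are illustrative only and do not correspond to any client library.

```python
class FanoutExchange:
    """Toy model of a fanout exchange: every bound queue gets every message."""

    def __init__(self):
        self.queues = {}  # queue name -> list of delivered messages

    def bind(self, queue_name):
        # No binding arguments: being bound is all that matters.
        self.queues.setdefault(queue_name, [])

    def publish(self, message, routing_key=""):
        # The routing key is accepted but ignored.
        for messages in self.queues.values():
            messages.append(message)
        return len(self.queues)  # number of queues the message reached


if __name__ == "__main__":
    prices = FanoutExchange()
    for subscriber in ("desk-a", "desk-b", "desk-c"):
        prices.bind(subscriber)
    reached = prices.publish({"symbol": "IBM", "price": 115.2}, routing_key="ignored")
    print("delivered to", reached, "queues")
```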
Queues are bound to the headers exchange with a table of arguments containing headers and values (optional). A special argument named "x-match" determines the matching algorithm, where "all" implies an AND (all pairs must match) and "any" implies OR (at least one pair must match).
A broker is required to provide an instance of this exchange named "amq.match".
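The "all" versus "any" matching can be sketched as a small function. A few things here are my own assumptions for the sake of a runnable example: a binding value of `None` stands for "header must be present, any value", a missing "x-match" is treated as "all", and a binding with no header pairs matches under "all" but not under "any".

```python
def headers_match(binding_args: dict, message_headers: dict) -> bool:
    """Decide whether a message's headers satisfy a headers-exchange binding."""
    mode = binding_args.get("x-match", "all")  # assumption: default to "all"
    if mode not in ("all", "any"):
        raise ValueError(f"unsupported x-match value: {mode!r}")

    def pair_matches(name, expected):
        if name not in message_headers:
            return False
        # None means "presence only" in this sketch (the value is optional).
        return expected is None or message_headers[name] == expected

    results = [pair_matches(name, expected)
               for name, expected in binding_args.items()
               if name != "x-match"]
    return all(results) if mode == "all" else any(results)


if __name__ == "__main__":
    message = {"type": "order", "region": "eu"}
    print(headers_match({"x-match": "all", "type": "order", "region": "us"}, message))
    print(headers_match({"x-match": "any", "type": "order", "region": "us"}, message))
```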
How AMQP Supports Common Messaging Use Cases
The most common messaging use cases are point-to-point (or store and forward) and publisher/subscriber models. These models can be easily built on top of AMQP.
routing_key == queue_name
routing_key == topic_hierarchy_value
Next Part : Part5 - Let's look at some code - Python examples
Prev Part : Part3 - Flexible Routing Model
Most pre-AMQP models had several issues with their routing models.
- Opaque routing models that were not explicitly defined.
- Since the semantics were not visible or explicit, manipulating the routing model through the protocol was difficult.
- Rigid monolithic routing engines that had limited or no extensibility or composability.
One of AMQP's primary goals was to define a flexible, extensible and transparent routing model where the semantics are explicitly defined. This permits the definition of management commands to manipulate the routing model. The AMQP model consists of three components
An exchange type defines a routing algorithm to match the bindings with a given message. Hence an exchange type represents a class of routing algorithm. An instance of an exchange type can be thought of as an instance of a routing algorithm. A broker can have multiple instances of an exchange type, which are identified by their name. An exchange instance can have the following properties.
A queue is analogous to a mailbox. A queue will store the messages in memory or on disk and deliver them to consumers. A queue binds itself to an exchange using a 'Binding', which describes the criteria for the type of messages it is interested in. Queues can have the following properties,
- Shared/Private (exclusive)
A binding is analogous to a routing table. A binding defines the relationship between an exchange and a queue. In other words, it defines the routing criteria. The simplest case is where the binding equals the queue name. A binding decouples a queue from an exchange. The same queue can be bound to any number of exchanges using the same criteria or different criteria. Different queues can be bound to the same exchange using the same routing criteria as well.
The routing key is a special field (header) present in the message delivery properties. It can be thought of as a virtual address, analogous to the 'To' field in an email. An exchange may use this field to route a message. The standard exchange types defined in AMQP use the routing key in different ways to route messages.
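To see how the pieces fit together, here is a small in-memory sketch of an exchange, queues and bindings. It is only a model of the relationships described above, not an AMQP client; the class names, the `matches` callback and the exchange name "audit" are my own choices for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Queue:
    name: str
    messages: list = field(default_factory=list)  # the "mailbox"


@dataclass
class Binding:
    queue: Queue
    criteria: str  # the routing criteria this queue is interested in


@dataclass
class Exchange:
    name: str
    # The exchange type's routing algorithm: (criteria, routing_key) -> bool
    matches: Callable[[str, str], bool]
    bindings: list = field(default_factory=list)

    def bind(self, queue, criteria=None):
        # Simplest case: the binding equals the queue name.
        chosen = queue.name if criteria is None else criteria
        self.bindings.append(Binding(queue, chosen))

    def publish(self, routing_key, body):
        """Deliver body to every queue whose binding matches; return the count."""
        delivered = 0
        for binding in self.bindings:
            if self.matches(binding.criteria, routing_key):
                binding.queue.messages.append(body)
                delivered += 1
        return delivered


if __name__ == "__main__":
    orders = Queue("orders")
    direct = Exchange("amq.direct", lambda criteria, key: criteria == key)
    catch_all = Exchange("audit", lambda criteria, key: True)
    direct.bind(orders)                # bound by queue name
    catch_all.bind(orders, "ignored")  # same queue, second exchange
    direct.publish("orders", "order #1")
    catch_all.publish("anything", "audit record")
    print(orders.messages)
```

The point of the sketch is that the queue knows nothing about the exchanges: only the bindings tie them together, so the same queue can sit behind several exchanges with different criteria.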
Standard Exchange Types
AMQP defines several standard exchange types that are described in detail in the next blog entry.
Extending The Routing Model
One can define new exchange types with arbitrary routing criteria (routing algorithms). For example one can define an exchange that routes messages based on content (content based routing). Thus AMQP provides a standard way of extending the routing model without impacting interoperability.
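As an illustration of a custom routing algorithm, here is a sketch of a content-based exchange where each binding carries a predicate over the message body. This is a toy in-memory model with names of my own choosing; how a real broker lets you plug in a new exchange type is implementation specific and not shown here.

```python
class ContentBasedExchange:
    """Toy exchange that routes on message content instead of a routing key."""

    def __init__(self):
        self._bindings = []  # list of (queue name, predicate) pairs

    def bind(self, queue_name, predicate):
        # predicate: a callable taking the message and returning True or False
        self._bindings.append((queue_name, predicate))

    def route(self, message):
        """Return the names of all queues whose predicate accepts the message."""
        return [name for name, predicate in self._bindings if predicate(message)]


if __name__ == "__main__":
    exchange = ContentBasedExchange()
    exchange.bind("large-orders", lambda m: m.get("amount", 0) >= 10000)
    exchange.bind("eu-orders", lambda m: m.get("region") == "eu")
    print(exchange.route({"amount": 25000, "region": "eu"}))
    print(exchange.route({"amount": 50, "region": "us"}))
```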
Next Part : Part4 - Standard Exchange Types And Supporting Common Messaging Use Cases
Prev Part : Part2 - Achieving Interoperability And Avoiding Vendor Lock-in
Friday, October 12, 2007
One of the key issues with any software is non-interoperability and vendor lock-in. Most messaging systems prior to AMQP did not interoperate with each other. For example, messages from Tibco's Rendezvous couldn't be routed through IBM's MQSeries. If two messaging systems need to be connected, there are two options.
- Using a message bridge you could convert from one format to the other. However a bridge would be slow as the conversion adds latency. Also you would need to understand the wire format of each of those systems.
- Replacing one system with the other, which is costly and risky. Downtime can have a severe impact on the company's revenue model.
What if we had messaging systems (from different vendors) that could understand each other? Then connecting two messaging systems, or replacing one system with the other, could be done with minimum cost and risk. Since the semantics (behaviour) are the same, the chance of something going wrong is relatively low. This is a key goal for the AMQP protocol.
So what does it take to achieve interoperability and avoid vendor lock-in?
- All brokers need to behave the same way
- All clients need to behave the same way
- Use a standard for commands on the wire
- Use a language neutral type system
- Use open standards and permit royalty free usage of such a protocol.
- Defining a network wire-level protocol
- A defined set of messaging capabilities (The AMQP Model)
- A simple language neutral type system
- Using open, existing, unencumbered, widely implemented standards
- Providing royalty free usage of the protocol
Ex. queueDeclare, queueBind, queuePurge, queueDelete & queueQuery
More details on the wire protocol will be discussed later on. The next few posts will focus on discussing the semantic model.
Next Part : Part3 - Flexible Routing Model
Prev Part : Part1 - Introduction