About Me

Ravi is an armchair futurist and an aspiring mad scientist. His mission is to create simplicity out of complexity and order out of chaos.

Friday, July 23, 2010

REST

REST (REpresentational State Transfer) is a simple architectural style or philosophy
  1. that requires you to identify or address entities in the system (called "resources")
  2. and that defines the actions or operations on those entities ("access methods").
The addressing mechanism is the URI - uniform resource identifier, e.g. http://www.google.com. The supported actions or operations are PUT, GET, POST and DELETE. In particular, PUT has creation semantics, GET has fetch semantics, POST has update semantics and DELETE has remove semantics.

Universal applicability
Because of its simplicity, REST has almost universal applicability. As an example, consider a book:
  • It can be considered as a resource and referred to by its book number (ISBN, e.g. 9871234567890). So its URI can be isbn://9871234567890.
  • You can write a new book by PUTting a new resource accessible at this URI.
  • You can fetch the book by GETting it from the URI.
  • You can modify the book by POSTing to the URI.
  • You can delete the book by DELETEing the URI.
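The book example above can be sketched in code. Below is a minimal in-memory illustration (not a real HTTP server); the class and method names are hypothetical, and the verb semantics follow this article's conventions (PUT = create, GET = fetch, POST = update, DELETE = remove):

```python
class ResourceStore:
    """A toy store mapping URIs to resource representations."""

    def __init__(self):
        self._resources = {}

    def put(self, uri, representation):
        # Creation semantics: store a new representation at the URI.
        self._resources[uri] = representation

    def get(self, uri):
        # Fetch semantics: must not change the state of the store.
        return self._resources.get(uri)

    def post(self, uri, representation):
        # Update semantics: modify an existing resource.
        if uri not in self._resources:
            raise KeyError(f"no resource at {uri}")
        self._resources[uri] = representation

    def delete(self, uri):
        # Remove semantics.
        self._resources.pop(uri, None)

store = ResourceStore()
store.put("isbn://9871234567890", {"title": "My Book", "pages": 100})
store.post("isbn://9871234567890", {"title": "My Book", "pages": 120})
print(store.get("isbn://9871234567890"))  # the updated representation
store.delete("isbn://9871234567890")
```

Note that the client needs to know only the URI and the four uniform verbs; nothing about the store's internals.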

HTTP is the best known usage of REST.

Details
One of the biggest values offered by REST is the standardization of its access methods. If you came up with different access methods (i.e. verbs) to access different resources, that proliferation would be so hard to track as to be of little value. Imagine that for dealing with books, your methods are "createBook", "getBook", "updateBook", "deleteBook" and for dealing with printers, they are "createPrinter", "getPrinter", "submitPrintJob", "deletePrinter". You need to know the resource type (in this case, book v/s printer) to know the operations that it supports. This customization leads to chaos even with a small number of resource types. With uniformity of access methods comes confidence that (a) you know beforehand what access methods are supported and (b) using an access method will result in (more or less) what you think it should result in.

REST standardizes the addressing mechanism (URI) and the access methods (GET/PUT/etc.). It does not standardize the message format, i.e. the data/information flowing over the REST mechanism. E.g. you can use binary, XML, JSON or your favorite message format and still conform with REST principles.

Since PUT has creation semantics, it must be idempotent, i.e. multiple executions of PUT with the same message must be no different from a single execution. Additionally, GET must not change the state of the system. This implies that GET must be idempotent too, i.e. multiple GETs (without any intervening POSTs!) should return the same representation. POSTs are expected to change the state of the entity and are not expected to be idempotent. Similarly, DELETEs are not expected to be idempotent either. One implication of these state-change and idempotence rules is the opportunity to cache resource representations between the resource and its clients, which can improve performance.
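The idempotence rules can be made concrete with a small sketch. Here `state` is a hypothetical resource store; POST is modeled as appending content (e.g. adding a chapter to a book), which is deliberately not idempotent:

```python
state = {}

def put(uri, rep):
    # Idempotent: repeating the same PUT leaves the state unchanged.
    state[uri] = rep

def get(uri):
    # Safe: never modifies `state`.
    return state.get(uri)

def post(uri, delta):
    # Update semantics, e.g. appending a chapter. Not idempotent:
    # repeating the request keeps changing the result.
    state[uri] = state.get(uri, "") + delta

put("isbn://1", "ch1 ")
put("isbn://1", "ch1 ")        # repeated PUT: same final state
assert state["isbn://1"] == "ch1 "

post("isbn://1", "ch2 ")
post("isbn://1", "ch2 ")       # repeated POST: differs from a single POST
assert state["isbn://1"] == "ch1 ch2 ch2 "

before = dict(state)
get("isbn://1")
assert state == before          # GET changed nothing
```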

How to apply REST to your system
One way to RESTify your system is to:
  1. Identify the top-level, first-class nouns in the system. These become your resources.
  2. Choose meaningful identifiers in your URIs for these resources. Usually these identifiers should be long-lived, i.e. their commonly-accepted meaning should rarely change with time. They should feel relevant and meaningful to the largest subset of the client population. E.g. instead of an obscure, numeric id (e.g. user id) for a resource, a more descriptive identifier (e.g. user name) may be a better choice.
  3. Verbs/operations in the system are restricted to one of create (PUT), get (GET), modify (POST) or delete (DELETE). Their semantics should be defined as they apply to the resources. It is acceptable for POST (for example) to mean differently to different resources. E.g. POST for a book resource may mean updating the book's contents, while POST for a printer may mean "submit a print job".
  4. Make sure that GETs and PUTs are idempotent. Specifically, make sure that GETs don't change the state of the system.
That's it! Your system is now REST-compliant (or RESTful). Of course, this is a simplification and each of the steps above takes non-trivial time. But at a high level, that's all that's usually involved.

Common mistakes
  1. Sometimes, system designers make GET change the state of the system. This is the most widespread violation in my experience, e.g. when GETs are used to submit data to resources to change their state. An HTTP URL like "AddToCart?item=candy" accessed via GET is a violation of REST principles. (In this specific case, POST is the right access method, since your resource is really the "cart" and you are updating it.)
  2. Another common violation is PUTting to a URI when the URI of the resource being created isn't known in advance. E.g. when submitting a new print job, you usually don't know the URI of the job resource to be created, but you do know the printer URI. In this case, POSTing to the printer URI is the right option.

Comparing REST with SOAP RPC
This uniformity/standardization of access methods is the fundamental difference between REST and SOAP RPC. While REST allows only PUT/GET/POST/DELETE, SOAP RPC encourages ad-hoc or custom access methods (GetOrders, AddToCart, SubmitPrintJob, etc). This implies that to use SOAP RPC, you need to know the access methods a priori. This can be a big disadvantage if you are targeting universal access.

Another fundamental difference is that SOAP is a protocol, REST is not. REST is more of a guideline or a principle. If someone says, "I am using the REST protocol", now you know how much they really know about REST!

One comment I frequently hear is "we can either use REST or XML, not both". This implies that XML over REST is impossible. That's not the case. Recall that REST does not define a message format. Here's how you can use XML with REST principles: define resources in your system, define URIs for them, restrict access methods to PUT/GET/POST/DELETE and then allow these access methods to carry XML. Voilà! You are now using XML with REST.
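To make "XML over REST" concrete, here is a sketch in which the access methods stay uniform (a PUT followed by a GET against a toy URI-keyed store) while the representation is an XML document built with Python's standard library. The URI and element names are just illustrative:

```python
import xml.etree.ElementTree as ET

def book_as_xml(title, pages):
    # Build an XML representation of a book resource.
    book = ET.Element("book")
    ET.SubElement(book, "title").text = title
    ET.SubElement(book, "pages").text = str(pages)
    return ET.tostring(book, encoding="unicode")

resources = {}
# PUT: create the resource with an XML representation.
resources["isbn://9871234567890"] = book_as_xml("My Book", 100)
# GET: fetch the same XML representation back.
representation = resources["isbn://9871234567890"]
print(representation)
```

The verbs didn't change; only the message format did. The same store could just as easily hold JSON or binary representations.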

The flip side
When it comes to generality of access pattern, nothing comes close to REST/HTTP. Using a single browser, you can access almost any resource (text (txt, html, etc.), images (gif, jpg, png, etc.), sounds (mp3, ram), video (mp4, mov, etc.), etc.) on the web. This is a clear advantage of REST. However, this doesn't mean that you have to use REST all the time. REST thrives when clients know and use generic access patterns (e.g. GET/PUT). If that is not the case, then REST is not needed. E.g. when a resource is used by a small number of clients, each of them can have knowledge of the specific operations supported by the resource. In this case, it can be argued that REST doesn't add much value.

In Closing
REST, as an architectural principle, standardizes resource identification and access. This uniformity is of enormous value in general and has led to some great things, e.g. HTTP over the internet. However, before you jump on the bandwagon, it never hurts to know your reasons.

References:
  1. http://www.prescod.net/rest/
  2. http://en.wikipedia.org/wiki/REST

Sunday, May 30, 2010

OAuth protocol simplified


The OAuth protocol is an outsourced authentication protocol: it allows the task of authentication to be performed by the one component that can actually authenticate the user and is, in fact, in charge of it - the authentication server - even though 3rd party components (called clients) request the authentication. It is different from OpenID and other federated identity protocols in that OAuth works with a single set of credentials, whereas OpenID accepts multiple sets of credentials (Yahoo id, Google id, etc.).

Once a service becomes a platform, 3rd parties create applications that operate on the platform. These applications need access to protected resources on the platform, which authenticates users with its own set of credentials. However, creating application-specific credentials to authenticate the user is not optimal: e.g. it is impractical to ask a Twitter user to enter an application-specific username and password for each application on the Twitter platform. Alternatively, giving the platform credentials to the 3rd party application is not secure either. OAuth addresses this problem by authenticating the user with the platform and allowing the 3rd party application to access the protected resources on behalf of the user.

There are 3 distinct parties involved in the OAuth interaction.
  • resource owner - e.g. the end user with credentials on the platform (aka "the server").
  • client - e.g. the 3rd party application, which is acting on behalf of the resource owner to access protected resources on the platform ("the server")
  • server - protected resource resides here. Additionally, the server can authenticate the resource owner.

Here is a sequence diagram, highlighting the interaction between the parties involved in OAuth:

More details about the interaction:
  1. The client ("3rd party application") registers with the server ("platform") and gets a set of credentials to identify itself to the server. This is an out-of-band activity.
  2. The user tries to access a protected resource on the server via the client. At this point, the user has not authorized the client to access the resource on its behalf, i.e. the server is unaware of any such authorization.
  3. The client identifies itself to the server and includes a callback URI that will be accessed by the user once the user has successfully authenticated itself to the server.
  4. The server issues a temporary set of credentials to the client, valid until the resource owner successfully authenticates itself to the server; the temporary credentials expire at that point.
  5. The resource owner's user agent (e.g. "the browser") is redirected to the server along with the temporary token.
  6. The user's browser connects to the server and hands over the temporary token. The user then authenticates to the server and, additionally, authorizes the client to request protected resources on the user's behalf. It is important to note that this authorization is temporary and can be revoked by the user at any time.
  7. Upon successful authentication, the browser is redirected to the callback URI, along with a piece of information called the "verifier", which confirms that the user authenticated with the server and authorized the client to access protected resources on the user's behalf.
  8. The client receives the verifier (along with its temporary token) at the callback URI. It can now pass the verifier to the server to prove that the user authorized it to work on the user's behalf.
  9. The client passes on the verifier along with its temporary credentials to the server.
  10. Upon successful verification, the server invalidates the temporary credentials, which are no longer needed, and generates a more permanent set of credentials.
  11. The new set of credentials is sent back to the client. It signifies that the user is authenticated with the server and has authorized the client to access protected resources on the user's behalf.
  12. The client now requests access to the protected resource using the new set of credentials.
  13. The server grants access to the resource and sends a representation of the resource back to the client.
  14. The client sends the resource representation to the end user.
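The credential exchanges in steps 1-14 can be simulated in a few dozen lines. This is a toy, in-memory sketch; all class and method names are hypothetical, and real OAuth additionally involves request signing, nonces and timestamps, which are omitted to keep the sequence of exchanges visible:

```python
import secrets

class Server:
    """Toy platform: registers clients, issues tokens, holds the resource."""

    def __init__(self):
        self.clients = {}        # client_id -> callback URI (step 1)
        self.temp_tokens = {}    # temp token -> {"client", "verifier"}
        self.access_tokens = {}  # access token -> client_id
        self.resource = "protected data"

    def register_client(self, callback_uri):          # step 1 (out of band)
        client_id = secrets.token_hex(4)
        self.clients[client_id] = callback_uri
        return client_id

    def issue_temp_credentials(self, client_id):      # steps 3-4
        token = secrets.token_hex(4)
        self.temp_tokens[token] = {"client": client_id, "verifier": None}
        return token

    def authorize(self, temp_token):                  # steps 6-7
        # The user authenticates and authorizes the client; the server
        # records a verifier and returns it for the callback redirect.
        verifier = secrets.token_hex(4)
        self.temp_tokens[temp_token]["verifier"] = verifier
        return verifier

    def exchange(self, temp_token, verifier):         # steps 9-11
        entry = self.temp_tokens.pop(temp_token)      # invalidate temp creds
        if entry["verifier"] != verifier:
            raise PermissionError("bad verifier")
        access_token = secrets.token_hex(4)
        self.access_tokens[access_token] = entry["client"]
        return access_token

    def fetch_resource(self, access_token):           # steps 12-13
        if access_token not in self.access_tokens:
            raise PermissionError("not authorized")
        return self.resource

# The full dance, end to end:
server = Server()
client_id = server.register_client("https://client.example/callback")
temp = server.issue_temp_credentials(client_id)
verifier = server.authorize(temp)       # happens via the user's browser
access = server.exchange(temp, verifier)
print(server.fetch_resource(access))    # the client acts on the user's behalf
```

The key property to notice: the client ends up with the resource without ever seeing the user's platform credentials.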

Here are the advantages of OAuth:
  • OAuth allows 3rd party applications operating on a platform to outsource the authentication functionality to the platform.
  • The credentials are contained within the platform and not exposed outside.
  • Also, there are no application-specific credentials for the resource owner to remember.
All these advantages make for a compelling case for OAuth for platforms.

Saturday, April 17, 2010

CAP theorem and its implications

CAP theorem
No distributed system can guarantee "consistency" (C), "availability" (A) and "partition tolerance" (P) at the same time.


Definition of Terms
The words "consistency", "availability" and "partition tolerance" are a common source of confusion as they mean different things to different people. Here is the definition by Gilbert and Lynch in their formal proof of the theorem, followed by my notes.

  • Consistency: "There must exist a total order on all operations, such that each operation looks as if it were completed at a single instant." In software engineering parlance, this is equivalent to having the system behave like it is executing requests in a single thread, in some sequence. A system is therefore consistent if each request sees the state of the system as if it were the only request being processed by the system at that time.
  • Availability: "For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response." In other words, the system is available if it is able to respond. The formal proof makes no distinction between arbitrarily long response times and failure to respond.
  • Partition Tolerance: "In order to model partition tolerance, the network will be allowed to lose arbitrarily many messages sent from one node to another." A system is partitioned if messages from one component to another are lost. This partitioning could be temporary or permanent, but that doesn't affect the outcome of the proof. In other words, a system is partition-tolerant if it responds correctly in the event of a network or node failure. Also, per the formal proof, "no set of failures less than total network failure is allowed to cause the system to respond incorrectly". As a quick mnemonic, I prefer to think of this property as "fault tolerance".

Implications of the theorem
Large systems need horizontal scale and hence, partition-tolerance. E.g. if one machine becomes unavailable, the system shouldn't fail. This implies that large systems only have "consistency" and "availability" to play with.


Consistency over availability
Some systems just need to be consistent. Banks and financial institutions fall in this category, and it is easy to see why. If money is transferred from account A to account B, you don't want to see it disappear from both A and B (the bank may like it, but A and B won't), nor do you want to see it present in both (A and B will like it, but the bank won't!). Such systems value consistency over availability.


Availability over consistency
This is the choice made by most large systems today (e.g. eBay, Amazon). It turns out that sacrificing consistency isn't as big a deal as it might initially feel, if inconsistencies get reconciled within an acceptable time frame. This is the concept of "eventual consistency", as opposed to total "consistency". If you list an item on eBay, you may be able to search for it only when the search sub-system becomes consistent with the listing sub-system; however, both remain available while the system is in an inconsistent state.
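The eBay listing-vs-search example can be sketched as two replicas with asynchronous replication. The class and method names are hypothetical; the point is that between the write and the background sync the system is inconsistent, yet both replicas keep answering requests:

```python
class EventuallyConsistentStore:
    def __init__(self):
        self.listing = {}   # accepts writes immediately
        self.search = {}    # updated later, in the background
        self.pending = []   # replication queue

    def write(self, key, value):
        self.listing[key] = value
        self.pending.append((key, value))   # replicate asynchronously

    def sync(self):
        # Background reconciliation; after this, the replicas agree.
        while self.pending:
            key, value = self.pending.pop(0)
            self.search[key] = value

store = EventuallyConsistentStore()
store.write("item-1", "vintage lamp")
print(store.search.get("item-1"))   # not yet searchable: inconsistent window
store.sync()
print(store.search.get("item-1"))   # now consistent with the listing replica
```

Both `listing` and `search` stayed available throughout; only consistency was (temporarily) given up.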


Closing remarks
Just as a physical system has inherent limits (e.g. the speed of light or absolute zero temperature), a distributed system has one such limit involving the parameters of consistency, availability and partition tolerance. This limit helps us in making informed decisions when optimizing on the parameters that the system values most.


References

  1. Formal Proof by Gilbert and Lynch
  2. Availability v/s Partition Tolerance (Canned Platypus)
  3. BASE - an ACID alternative




Tuesday, March 16, 2010

Software Architecture

Large software is not monolithic - an undifferentiated mass of code. It is partitioned into tiers and layers. This partitioning helps in conceptualizing, architecting, designing, implementing, unit testing and maintaining the software with relative ease. This article talks about these partitioning methods, their dependencies and their interactions.

Note that this article does not talk about other types of architectures, like application architecture (partitioning an application into various interacting components), operations architecture (partitioning a system for ease of operational maintenance), etc.

Tiering
Tiering is partitioning software by the level of abstraction. The higher the abstraction, the closer the artifact being modeled resembles our conception and way of thinking (e.g. a car). Conversely, the lower the abstraction, the closer it resembles raw physical entities (e.g. database records or disk storage).



Partitions
  • Resource tier holds external services that the software depends on, e.g. database server. Usually, the server does not have access to the state of the objects contained in this tier (e.g. internal object representation of a record in a database).
  • Integration tier holds objects that are a local representation of external data, e.g. records stored on a database server. They serve to integrate with external services. At this level of abstraction, no business interpretation is given to these objects: numbers are retrieved from the database, but no meaning is imparted to them (mappings like 1=ACTIVE, 2=DEACTIVATED do not belong here). In other words, this tier holds nothing more than a local representation of external data. This is usually the lowest level of abstraction contained within the server.
  • Business (Biz) tier holds objects that have been imparted business interpretation. Additionally, it holds factories for creating those business objects, and associated business logic. Most of the server-side action happens in this level of abstraction.
  • Presentation (Prez) tier holds objects and logic related to presentation of the business artifacts. User interfaces (display, validation, etc.) live here. Along with the Biz tier, this is another level of abstraction where the action happens. This is usually the highest level contained within the server.
  • Client tier holds client-side objects, e.g. Javascript objects. The server doesn't have direct access to their states. Objects in this tier most closely resemble our way of thinking about them, e.g. a song being played, instead of bytes in an MP3 file.
Variations
  • A "Communication" tier can be introduced between the "Integration" and "Resource" tiers, if the server and the resource communicate via a proprietary protocol. If using an open protocol (e.g. HTTP), this tier can be abstracted over and removed from being represented in the stack. 
Interaction between partitions
  • To reduce coupling and maintain clean separation of tiers, data that needs to be transferred from one tier to the next can be translated into something that the next tier understands by using a tier translator. These translators would reside on the tier boundary and be the only objects that do so.
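A tier translator can be sketched as the one place where the integration-tier shape (a raw record) is mapped to a biz-tier shape (a business object). The status codes below echo the 1=ACTIVE, 2=DEACTIVATED example from the Integration tier bullet; the class and field names are hypothetical:

```python
from dataclasses import dataclass

# Integration tier: raw data, no business meaning attached.
raw_record = {"id": 42, "name": "alice", "status": 1}

# Biz tier: business interpretation lives here.
@dataclass
class User:
    name: str
    active: bool

def integration_to_biz(record):
    # Tier translator: the only object that knows both shapes,
    # and the one place where the mapping 1=ACTIVE is encoded.
    return User(name=record["name"], active=(record["status"] == 1))

user = integration_to_biz(raw_record)
print(user)   # a biz-tier object, decoupled from the record layout
```

If the database schema changes, only the translator needs to change; the biz tier never sees raw status codes.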


Layering
Layering is partitioning software by degree of functional specificity or reusability. As the software partition becomes more functionally specific, its reusability goes down.


Partitions
  • Kernel holds a basic set of functions and services, common to all functionality provided by the application suite. There is one kernel per application suite. An application suite is an arbitrary grouping of applications, e.g. the Microsoft Office suite.
  • Domain holds a common set of functionality, e.g. ability to search a data store, library of WYSIWYG components, etc. This allows for multiple domains in the application stack, one possibly for each set of functionality. The idea behind this partitioning is that the application can pick and choose the domains it needs to do its job, without having to depend on all domains in the stack. This reduces the application's disk/memory footprint.
  • Application holds application-specific logic, mostly process orchestration unique to the application, depending on various domains to do its job. E.g. blogging application, that depends on WYSIWYG component domain and search domain to provide the ability to create, edit and search blogs.
Variations
  • The above stack can be altered to fit the unique needs of an organization. Here are some examples:
    • A "micro kernel" can serve as the foundation of the kernel itself. This allows for multiple kernels, each specific to a particular application suite. E.g. a set of web applications and a set of desktop applications may share a common micro kernel.
    • A "domain foundation" or "application foundation" sandwiched between the kernel and the domain can provide. This layer can provide artifacts shareable by all domains. E.g. industry-standard business interfaces or processes.
Conceptualization
  • Here is one way to conceptualize the various layers (granted that this is a simple way to understand their purpose and may be hard to find in practice):
    • Theoretically, the kernel can be shared by all applications, developed by any organization, in any industry. (In reality, the kernel is rarely shared outside of the organization, but stay with me for a moment.)
    • The application foundation can be shared by all applications, in any organization within a specific industry. It may contain definitions for industry-standard business interfaces and business processes.
    • The domain can be shared by all applications within an organization needing a common functionality. It is possible that certain applications within an organization don't need the functionality offered by a particular domain. In such cases, the application won't share that domain.
    • An application offers a unique set of functionality (hopefully!).
Interaction between partitions
  • The application layer can depend on one or more domains. Nothing can depend on an application, including other applications.
  • A domain can depend on the kernel. Domains can also depend on other domains, but there shouldn't be a cyclic dependency - which is usually the result of incorrect dependency assumptions or incorrect partitioning of functionality into domains.
  • Kernel cannot depend on the other partitions.
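The dependency rules above can be encoded as data and checked mechanically. The layer and module names below (a blogging application over search and WYSIWYG domains) are illustrative, taken from the earlier examples:

```python
LAYER = {"kernel": "kernel", "search": "domain", "wysiwyg": "domain",
         "blogging": "application"}

DEPS = {"blogging": ["search", "wysiwyg"],
        "search": ["kernel"],
        "wysiwyg": ["kernel"],
        "kernel": []}

def check(deps, layer):
    for module, targets in deps.items():
        # The kernel cannot depend on the other partitions.
        if layer[module] == "kernel":
            assert not targets, "kernel cannot depend on other partitions"
        for target in targets:
            # Nothing can depend on an application.
            assert layer[target] != "application", f"{module} -> {target}"

    # No cyclic dependencies: depth-first search for back edges.
    def visit(node, seen):
        assert node not in seen, f"cycle at {node}"
        for target in deps.get(node, []):
            visit(target, seen | {node})

    for module in deps:
        visit(module, set())

check(DEPS, LAYER)   # the example stack satisfies all three rules
```

A check like this can run in a build step, catching a layering violation before it hardens into an architectural problem.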

Layers and Tiers
Large software is partitioned along two axes - layers and tiers. This gives rise to interesting combinations, which we look at below. I do not cover resource and client tiers, since they do not reside on the server.




Kernel layer
  • Integration tier: this tier is used to integrate with the operating system or other low-level system. This should be efficient, robust, reusable, highly available and high quality (i.e. relatively bug free) code. After all, many, if not all, applications in your application suite depend on this code.
  • Biz tier: Having business logic in the kernel is hard to justify. In my opinion, there should be no code here.
  • Prez tier: Having presentation logic is even harder to justify in this tier.
Domain layer
  • Integration tier: This is a busy tier. Most of the system's integration-tier objects live in this layer. It provides integration with application-generic, external services, like database servers.
  • Biz tier: This is another busy tier. Logic related to a reusable functional area is found here.
  • Prez tier: This holds a library of presentation artifacts that can be shared across multiple applications. This isn't as busy as other tiers in this layer, unless you have high reusability of presentation artifacts.
Application layer
  • Integration tier: Application-specific integration is found in this part of the neighborhood. It is hard to justify an integration artifact that cannot be interpreted to be a part of the domain (i.e. base functional area). It is possible, just unlikely.
  • Biz tier: This holds application-specific business logic. In other words, this logic is unique to this application.
  • Prez tier: This holds presentation logic unique to this application.
In closing, I think that designing and implementing an application suite as layers and tiers helps in creating clean and maintainable software.