Tuesday, December 17, 2013

Ethernet Fabrics: An Overview

What is an Ethernet Fabric?

Fabrics offer one alternative for meeting the demands of the modern data center. An Ethernet fabric is a type of network topology that collapses traditional three-tier data center switching architectures into one or two tiers, ensures network traffic is never more than two hops away from its destination and operationally behaves like one big switch. The goal of an Ethernet fabric is to increase the scalability, performance and resilience of highly virtualized and cloud-ready data centers.

Vendors typically sell commercial Ethernet fabrics as a product suite comprising specialized hardware and software. Fabrics can be implemented using various architectures, the most common among them leaf-spine and mesh. Several terms -- including data center fabric, switch fabric, network fabric and Ethernet fabric, or some amalgamation of all four -- are often used interchangeably to refer to this topology.

Ethernet fabric refers to flatter, highly scalable topologies that enable low latency via one- or two-hop connectivity in data center switching.

Why would one need an Ethernet Fabric?

Applications -- and by extension, the networks that delivered them -- used to be a lot simpler. Application traffic primarily flowed north and south between three tiers of switching that connected servers to clients.

The widespread adoption of server virtualization, however, transformed these dynamics, enabling once-static workloads to become virtual machines capable of exploiting a data center's entire capacity by moving among multiple physical servers. Applications were also becoming more complex, with various functions being broken off into different systems and components that would need to communicate across servers.

This influx of east-west traffic -- that is, traffic between and among servers -- has strained traditional three-tier switching architectures and limited scalability. Data would now have to pass through more hops to reach its destination, adding latency and consequently degrading performance.

Meanwhile, the performance and resilience of data center networks were further hamstrung by the pervasive use of Spanning Tree Protocol (STP), an algorithm that prevents bridge loops by shutting down redundant paths in favor of a single active link that transmits data. While STP is sufficient for conventional traffic flows and application architectures, it is an imperfect and fragile approach that uses bandwidth inefficiently.

Ethernet fabrics -- along with complementary technologies such as TRILL and Shortest Path Bridging (SPB) -- offer an alternative to the complexity and inefficiencies of three-tiered networks and Spanning Tree. An interconnected fabric combines the visibility of Layer 2 with the operational benefits of Layer 3.

Limitations of Ethernet Fabrics

  1. Vendor lock-in: While a fabric offers many benefits, there is one major challenge that can be a deal-breaker for some network engineers: it almost always requires a single-vendor network. With few exceptions, vendors have created proprietary enhancements to standard protocols such as TRILL and SPB, which has rendered most vendors' fabrics incompatible with their competitors' infrastructure.
  2. Scaling limits: Fabrics are not infinite in capacity. Once you get into the multi-thousand-port range, the management headaches are such that you should consider segmenting the network.

Brocade Offerings

http://www.brocade.com/products/all/switches/index.page?network=ETHERNET_FABRIC

Brocade's TRILL-based Virtual Cluster Switching (VCS) data center fabric is designed for building large Layer 2 network domains. It debuted in 2010 with Brocade's series of fixed-form factor VDX 6700 switches, which are top-of-rack devices that cluster together in self-forming fabrics.

With the Brocade VDX 8770 series, server-facing port capacity has grown to 8,000 ports. The switches initially shipped with Gigabit Ethernet, 10 GbE and 40 GbE line cards, but the 4 Tbps per-slot backplane capacity is aimed at eventually supporting high-density 100 GbE ports. The VDX 8770 features 3.5-microsecond port-to-port latency and a highly scalable MAC address table, capable of supporting up to 384,000 virtual machines in a single switch.

The VDX 8770 also has a new feature that allows customers to establish multiple load-balanced Layer 3 gateways within a Layer 2 VCS fabric, increasing the bandwidth available between Layer 2 domains.

To support high-performance networks, Brocade has upgraded its ADX application delivery controllers (ADCs, i.e., load balancers) with a multi-tenancy feature that allows enterprises and cloud service providers to slice up the resources of the appliances and assign virtual instances of an ADC to specific applications and services.

The ADX multi-tenancy doesn't slice resources by individual CPU core. It assigns tenants to processor subsystems, giving enterprises the ability to "mix and match capacity without having to determine where those processor hosts are in the system".

Multi-tenancy in Ethernet Fabrics

Brocade engineered native multi-tenancy through support of an extension to Transparent Interconnection of Lots of Links (TRILL), the Internet Engineering Task Force (IETF) standard that enables multi-path forwarding in a Layer 2 network and eliminates the need for spanning tree. The extension, known as Fine-Grained Labeling (FGL), replaces 12-bit virtual local area network (VLAN) labeling in a TRILL-based Ethernet frame with 24-bit FGL labeling. FGL expands the number of network segments that a network engineer can create in a Layer 2 network from 4,000 to 16 million.
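
The jump from roughly 4,000 segments to 16 million falls directly out of the label width. A quick illustration in Python:

# 12-bit VLAN IDs vs. 24-bit FGL labels
vlan_bits = 12
fgl_bits = 24
print(2 ** vlan_bits)  # 4096     -> the familiar ~4,000 VLAN limit
print(2 ** fgl_bits)   # 16777216 -> roughly 16 million FGL segments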

"It's better than VLANs. And with VXLAN [Virtual Extensible LAN], NVGRE [Network Virtualization using Generic Routing Encapsulation] and STT, you're creating overlay mechanisms that step you out of Layer 2. [FGL] keeps all of that partitioning still running at Layer 2. In theory, you have better low-level control and performance."

The number of network segments in a Layer 2 network is a basic building block of a multi-tenant data center or cloud. Some vendors have tackled segmentation through overlay technologies like VMware NSX and tunneling protocols such as VXLAN and NVGRE. Brocade has embraced overlays with its support of VMware's NSX VXLAN termination endpoints on its VDX. However, not every IT organization is ready to embrace overlay products, which can blur the organizational lines between the network and server teams. "The networking guys want to manage and control the multi-tenancy solution." FGL uses the same constructs as VLANs, which promises a gentle learning curve for network pros. Also, since the technology is network-based, FGL is hypervisor-agnostic, unlike most overlay products.

Brocade enhanced the AutoQoS feature of VCS to apply automatic Quality of Service policies to storage traffic that traverses the fabric.

Brocade launched the VDX 6740 series of 10/40 Gigabit Ethernet (GbE) top-of-rack switches for the VCS fabric. Brocade is selling both fiber optic and copper versions of the VDX 6740 at the outset. The new 1RU switch has 1.28 Tbps of bandwidth, with 64 10 GbE ports and four 40 GbE ports. Brocade also announced a 100 GbE line card for its VDX 8770 chassis, which will start shipping in the first half of 2014.

Although Brocade's VCS multi-tenancy is based on the IETF's FGL specification, Brocade's overall implementation of TRILL remains proprietary, much like Cisco FabricPath, the only other major TRILL-based Ethernet fabric on the market. This prevents interoperability. While the data forwarding plane of VCS complies with the IETF's TRILL specification, the control plane is proprietary.

Software defined networking (SDN) and Ethernet Fabrics: Do the two technologies intersect?

When we think about SDN, we think about the ability to influence network behavior from outside the network by using the OpenFlow protocol. OpenFlow is largely supplementary to existing forwarding and routing techniques in a network, and it's supplementary to VCS. What many are going to do with OpenFlow is say, "I've got some unique network behavior that I would like to instantiate using OpenFlow because my networking vendor doesn't provide it natively within the switch." But they're not looking for OpenFlow to displace all of their routing and forwarding techniques. Outsourcing the entire control plane would be a pretty big bite to take. Brocade's intent is to implement OpenFlow, both within the VCS fabric and other data center platforms, in that same supplementary fashion.

Another variant of SDN is network virtualization, the ability to apply logical networks or overlay networks on top of an existing physical network infrastructure. One of the benefits of that is to give the customer more freedom in terms of scalability and going beyond VLAN ID and MAC address table size limitations. In introducing logical networks, you're actually increasing the overall administrative overhead that the customer has to deal with. The beauty of a VCS fabric is that through automation and simplicity, you can reduce your administrative burden and your operational overhead in the physical network infrastructure, and invest more time in that logical overlay.

Converged Storage Networking (FCoE): What is the future?

There is some [modest] adoption in convergence and FCoE from the server to the first hop in the network. That is gaining some momentum because it's an opportunity for customers to reduce the adapters in the server and simplify cabling. VCS is capable of supporting that convergence to the switch, at which point we're capable of breaking out native Fibre Channel traffic on the SAN [storage area network] and non-storage traffic on the Ethernet LAN.

More important from a convergence point of view is the growth of IP storage and our ability to leverage VCS to make NAS [network attached storage] or iSCSI work better. VCS provides a lower-latency and higher-throughput environment than a conventional LAN, so it's a better IP storage transport. Also, VCS has a scaling property that aligns with the way customers think about scaling up their NAS environments. They want to be able to add pods of storage and have that federate into the existing IP storage architecture. That's the way you scale a VCS fabric.

 

References

  1. http://searchnetworking.techtarget.com/news/2240173618/FCoE-for-converged-networking-Not-quite-Brocade-veep-says
  2. http://searchnetworking.techtarget.com/news/2240173619/Network-Innovation-Award-Brocade-VCS-Fabric
  3. http://searchnetworking.techtarget.com/news/2240206057/Brocade-adds-multi-tenancy-to-VCS-Ethernet-fabric-new-ToR-switches
  4. http://searchnetworking.techtarget.com/news/2240163205/New-Brocade-VDX-chassis-adds-scale-to-data-center-fabric?src=itke%20disc

Friday, November 01, 2013

Eclipse 4.2 in Ubuntu 12.04

Eclipse 4.2 in Ubuntu 12.04 | Bruno Braga:

  1. Install the version of Eclipse that comes with the Ubuntu repo (for 12.04 it's Eclipse 3.8).
  2. Then follow the steps below to replace that version with the latest Eclipse. After that, clicking the original Eclipse icon will launch the updated Eclipse.
  3. Note: we will need to redo these steps every time we want to update the Eclipse installation.

# Get the Eclipse installer for Linux 
# (if you do not know the flavour, just choose "Classic")
# http://www.eclipse.org/downloads/
 
 
# Unpack it
$ tar -zxvf eclipse-SDK-4.2-linux-gtk.tar.gz
 
 
# if you have a previous Eclipse version installed
# just move it (in case anything goes terribly wrong, 
# you can just rollback)
$ sudo mv /usr/lib/eclipse /usr/lib/eclipse-old
 
 
# move the unpacked directory to lib
$ sudo mv eclipse /usr/lib/
 

Why OpenTSDB chose HBase for Time Series data storage? - Stack Overflow

Why OpenTSDB chose HBase for Time Series data storage? - Stack Overflow:

A nice reply to this question by the OpenTSDB author on why StumbleUpon uses HBase (Facebook, in fact, does the same for big data analytics):

I chose HBase because it scales. Whisper is much like RRD: it's a fixed-size database, so it must destroy data in order to work within its space constraints. HBase offers the following properties that make it very well suited for large scale time series databases:
  1. Linear scaling. Want to store more data? Add more nodes. At StumbleUpon, where I wrote OpenTSDB, our time series data was co-located on a 20-node cluster that was primarily used for analytics and batch processing. The cluster grew to 120 nodes fairly quickly, and meanwhile OpenTSDB, which makes up only a tiny fraction of the cluster's workload, grew to half a trillion data points.
  2. Automatic replication. Your data is stored in HDFS, which by default means 3 replicas on 3 different machines. If a machine or a drive dies, no big deal. Drives and machines die all the time when you build commodity servers. But the thing is: you don't really care.
  3. Efficient scans. Most time series data is used to answer questions like "what are the data points between time X and Y?". If you structure your keys properly, you can implement this very efficiently with HBase with a simple scan operation.
The fact that HBase is column oriented wasn't nearly as important a consideration as the fact that it's a big sorted key-value system that really scales.
All RRD-based and RRD-derived tools couldn't satisfy the scale requirements of being able to accurately store billions and billions of data points forever for very cheap (just a few bytes of actual disk space per data point).
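
The "efficient scans" point hinges entirely on row-key design. Here is a rough sketch of the idea in Python; it is illustrative only, not OpenTSDB's actual on-disk schema (real OpenTSDB keys also encode tag IDs):

import struct

def row_key(metric_id, timestamp):
    # Big-endian packing makes lexicographic byte order match numeric
    # order, so all rows for one metric sit together, sorted by time.
    return struct.pack(">IQ", metric_id, timestamp)

# "Data points between time X and Y" for one metric is then a single
# contiguous scan over [start, stop):
start = row_key(42, 1380000000)  # time X
stop  = row_key(42, 1380003600)  # time Y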

TempoDB docs:

  1. https://tempo-db.com/docs/modeling-time-series-data/
  2. http://www.plainlystated.com/2013/01/bulk-loading-time-series-data-tempodb/

Wednesday, October 30, 2013

Installing Groovy & Grails on Linux

http://gvmtool.net/
GVM is a tool for managing parallel versions of multiple software development kits on most Unix-based systems. It provides a convenient command line interface for installing, switching, removing and listing candidates.
GVM was inspired by the highly useful RVM and rbenv tools, used at large by the Ruby community.
Candidates that are currently supported are:
  • Gaiden
  • Groovy
  • Grails
  • Griffon
  • Gradle
  • Groovyserv
  • Lazybones
  • Spring Boot
  • vert.x

Groovy REST Client

Using RESTClient for Groovy - http://groovy.codehaus.org/modules/http-builder/doc/rest.html

Example code:
package restclient

import groovyx.net.http.RESTClient
import static groovyx.net.http.ContentType.JSON


restClient = new RESTClient( 'http://localhost:8080/CustomerDB/webresources/', JSON)
def resp = restClient.get(path: 'entities.customer')
assert resp.status == 200
println resp.data
def respHeaders = resp.getAllHeaders()
respHeaders.each { header ->
    println header
}


Tuesday, October 29, 2013

Groovy By Example: For Java Programmers

This is based on - http://www.infoq.com/presentations/Groovy-for-Java-Programmers

Groovy is a JVM scripting language. It supports dynamically typed variables and comes with the groovy interpreter, the groovyc compiler and groovyConsole (a small Swing editor). It interoperates with Java seamlessly: Groovy can call any Java method and use any Java class. The same holds true for Java - it can call into Groovy scripts.

Example 1:
class GroovyPerson {
    // dynamically typed - no semi colons needed.
    def age
    // statically typed
    String name
    
    def printName() {
        println name
    }
    
    static void main(String[] args) {
        // default constructor can accept any number of properties - no getters/setters
        def person = new GroovyPerson(age:7, name: 'Jake')
        person.printName()
    }
}

Example 2:

def cal = Calendar.instance
cal.clear() // parens required when not passing params to a method
cal.set Calendar.MONTH, Calendar.JULY // parens optional when passing params to a method
cal.set Calendar.DATE, 4
cal.set Calendar.YEAR, 1776

def time = cal.time

println time

Example 3 (Example 2 rewritten using with):
def cal = Calendar.instance
cal.with { // Groovy adds a with() method to every Java class to support this usage
    clear()
    set MONTH, JULY
    set DATE, 4
    set YEAR, 1776
    println time
}

The with {} method on an object lets us drop the cal. prefix inside the with block.

Everything in Groovy is an object.
Example 4:
result = 10
println result.class

outputs: class java.lang.Integer
   
Example 5:

aString = 'This is a string' // a regular String is single-quoted

answer = 42
aGString = "The answer is $answer" // a GString (may contain embedded Groovy code) is double-quoted

Example 6:
message = 'Groovy Is Cool' // last character in a string is index -1, first is index 0

println message[0] //G
println message[-4] // C
println message[0..5] // Groovy
println message[-4..-1] //Cool
println message[-1..-4] // looC (substrings can be obtained with ranges)
    
Example 7: List
names = ['Watsh', 'Manish', 'Saket']

println names.class // java.util.ArrayList
names << 'Rahul' // add to the list
println names

Example 8: Maps
myMap = [name:'Watsh', language: 'Groovy']
myMap.town = 'San Ramon' // put value in the map for the town key
myMap['company'] = 'Brocade' // put value in the map for the company key

println myMap.getClass() // java.util.LinkedHashMap (default)

println myMap.company

Example 9: Class - no getter/setter for properties
class BaseballTeam {
    def city
    def team
    
    def getDisplayName() {
        "$city $team"

    }
}

myTeam = new BaseballTeam(team: 'Bulls', city: 'San Ramon')

println myTeam.team
println myTeam.city
println myTeam.getDisplayName()


Example 10: Closure
myClosure = {
    println "This is closure object $it" -- 'it' is implicit object for a closure.
}
3.times myClosure // Integer's times() method takes a closure and invokes it n times

outputs:
This is closure object 0
This is closure object 1
This is closure object 2

The return statement is optional. If no return statement is found, the last expression in the method is evaluated and returned.

Example 11: More Closure
// passing argument to closure
3.times {firstArg ->
    println "The argument is $firstArg"
}

// using closure to iterate a hashmap
data = [company:'Brocade',
        name: 'Watsh']
        
data.each {key, value -> 
    println "$key is $value"
}

Installing Oracle JDK on Ubuntu

JDK version 1.7.0 update 45
Linux Mint 15 x86-64


Reference: http://www.wikihow.com/Install-Oracle-Java-JDK-on-Ubuntu-Linux


  1. Remove OpenJDK 
    1. sudo apt-get purge openjdk-\*
  2. sudo mkdir -p /usr/local/java
  3. sudo cp -r jdk-7u45-linux-x64.tar.gz /usr/local/java
  4. cd /usr/local/java
  5. Extract the archive: 
    1. sudo chmod a+x jdk-7u45-linux-x64.tar.gz
    2. sudo tar xvzf jdk-7u45-linux-x64.tar.gz
  6. Create a symbolic link: jdk7 -> jdk1.7.0_45
    1. sudo ln -s jdk1.7.0_45 jdk7
This is done so that the next time we update the JDK, we only need to repoint the jdk7 symbolic link to the latest version instead of redoing all the steps.
  7. Add JAVA_HOME to the PATH environment variable:
    1. sudo gedit /etc/profile
    2. Paste and save:
JAVA_HOME=/usr/local/java/jdk7
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
export JAVA_HOME
export PATH
  8. Inform the Linux system where the Oracle JDK is located:
    1. sudo update-alternatives --install "/usr/bin/java" "java" "/usr/local/java/jdk7/bin/java" 1
    2. sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/local/java/jdk7/bin/javac" 1
    3. sudo update-alternatives --install "/usr/bin/javaws" "javaws" "/usr/local/java/jdk7/bin/javaws" 1
  9. Inform the Linux system that the Oracle JDK must be the default Java:
    1. sudo update-alternatives --set java /usr/local/java/jdk7/bin/java
    2. sudo update-alternatives --set javac /usr/local/java/jdk7/bin/javac
    3. sudo update-alternatives --set javaws /usr/local/java/jdk7/bin/javaws
  10. Reload /etc/profile:
    1. . /etc/profile
  11. Test to see if the installation is correct:
    1. java -version
    2. javac -version
    3. javaws -version

REST Web APIs - Part 1

These are my notes from the book RESTful Web APIs by Leonard Richardson and Mike Amundsen.


  1. RESTful architectures are designed for managing change. 
  2. We all understand hypermedia in the context of the Web. It’s just a fancy word for links. Web pages link to each other, and the result is the World Wide Web, driven by hypermedia. Hypermedia is the single most important aspect of REST, and the least understood.
  3. We say that a URL is the URL of some thing: a product, a user, the home page. The technical term for the thing named by a URL is resource.
  4. When a web browser sends an HTTP request for a resource, the server sends a document in response (usually an HTML document, but sometimes a binary image or something else). Whatever document the server sends, we call that document a representation of the resource.
  5. A URL identifies one and only one resource. If a website has two conceptually different things on it, we expect the site to treat them as two resources with different URLs.
  6. The principle of addressability just says that every resource should have its own URL.
  7. Application state is kept on the client, but the server can manipulate it by sending representations — HTML documents, in this case — that describe the possible state transitions. Resource state is kept on the server, but the client can manipulate it by sending the server a representation — an HTML form submission, in this case — describing the desired new state.
  8. The strands of the Web are the HTML <a> tags and <form> tags, each describing a GET or POST HTTP request Alice might decide to make. I call this the principle of connectedness: each web page tells you how to get to the adjoining pages. The Web as a whole works on the principle of connectedness, which is better known as “hypermedia as the engine of application state,” sometimes abbreviated HATEOAS.
  9. You know what application state is — it’s which web page a client is on. Hypermedia is the general term for things like HTML links and forms: the techniques a server uses to explain to a client what it can do next. To say that hypermedia is the engine of application state is to say that we all navigate the Web by filling out forms and following links.
  10. All successful post-Web protocols do something the Web can’t do: peer-to-peer protocols like BitTorrent and real-time protocols like SSH. For most purposes, HTTP is good enough.
  11. The unprecedented flexibility of the Web comes from the principles of REST.
  12. In REST terms, putting information about URL construction in separate human-readable documents violates the principles of connectedness and self-descriptive messages. In REST terms, the website redesign is entirely encapsulated in the self-descriptive HTML documents served by the website. A client that could understand the old HTML documents can understand the new ones.
  13. The HTTP standard says that a GET request is a request for a representation. It’s not intended to change any resource state on the server.
  14. application/vnd.collection+json, you’ll discover that it’s a media type registered for Collection+JSON. Collection+JSON is a standard for publishing a searchable list of resources over the Web. JSON puts constraints on plain text, and Collection+JSON puts constraints on JSON. A server can’t serve just any JSON document as application/vnd.collection+json. It can only serve a JSON object: {} But not just any object. The object has to have a property called collection, which maps to another object: {"collection": {}} The “collection” object ought to have a property called items that maps to a list: {"collection": {"items": []}} The items in the “items” list need to be objects: {"collection": {"items": [{}, {}, {}]}} And on and on, constraint after constraint. Eventually you get the highly formatted document you just saw, which starts out like this:
A typical Collection+JSON will contain a set of links, list of items, a queries collection, and a template object.
{ "collection" :
  {
    "version" : "1.0",
    "href" : "http://example.org/friends/",
    
    "links" : [
      {"rel" : "feed", "href" : "http://example.org/friends/rss"}
    ],
    
    "items" : [
      {
        "href" : "http://example.org/friends/jdoe",
        "data" : [
          {"name" : "full-name", "value" : "J. Doe", "prompt" : "Full Name"},
          {"name" : "email", "value" : "jdoe@example.org", "prompt" : "Email"}
        ],
        "links" : [
          {"rel" : "blog", "href" : "http://examples.org/blogs/jdoe", "prompt" : "Blog"},
          {"rel" : "avatar", "href" : "http://examples.org/images/jdoe", "prompt" : "Avatar", "render" : "image"}
        ]
      },
      
      {
        "href" : "http://example.org/friends/msmith",
        "data" : [
          {"name" : "full-name", "value" : "M. Smith", "prompt" : "Full Name"},
          {"name" : "email", "value" : "msmith@example.org", "prompt" : "Email"}
        ],
        "links" : [
          {"rel" : "blog", "href" : "http://examples.org/blogs/msmith", "prompt" : "Blog"},
          {"rel" : "avatar", "href" : "http://examples.org/images/msmith", "prompt" : "Avatar", "render" : "image"}
        ]
      },
      
      {
        "href" : "http://example.org/friends/rwilliams",
        "data" : [
          {"name" : "full-name", "value" : "R. Williams", "prompt" : "Full Name"},
          {"name" : "email", "value" : "rwilliams@example.org", "prompt" : "Email"}
        ],
        "links" : [
          {"rel" : "blog", "href" : "http://examples.org/blogs/rwilliams", "prompt" : "Blog"},
          {"rel" : "avatar", "href" : "http://examples.org/images/rwilliams", "prompt" : "Avatar", "render" : "image"}
        ]
      }      
    ],
    
    "queries" : [
      {"rel" : "search", "href" : "http://example.org/friends/search", "prompt" : "Search",
        "data" : [
          {"name" : "search", "value" : ""}
        ]
      }
    ],
    
    "template" : {
      "data" : [
        {"name" : "full-name", "value" : "", "prompt" : "Full Name"},
        {"name" : "email", "value" : "", "prompt" : "Email"},
        {"name" : "blog", "value" : "", "prompt" : "Blog"},
        {"name" : "avatar", "value" : "", "prompt" : "Avatar"}
        
      ]
    }
  } 
}
  15. Collection+JSON is a way of serving lists — not lists of data structures, which you can do with normal JSON, but lists that describe HTTP resources. The collection object has an href property, and its value is a JSON string. But it's not just any string — it's the URL I just sent a GET request to: { "collection": { "href" : "http://www.youtypeitwepostit.com/api/" } } The Collection+JSON standard defines this string as “the address used to retrieve a representation of the document” (in other words, it's the URL of the collection resource). Each object inside the collection's items list has its own href property, and each value is a string containing a URL, like http://www.youtypeitwepostit.com/api/messages/21818525390699506 (in other words, each item in the list represents an HTTP resource with its own URL).
  16. A document that doesn't follow these rules isn't a Collection+JSON document: it's just some JSON. By allowing yourself to be bound by Collection+JSON's constraints, you gain the ability to talk about concepts like resources and URLs. These concepts are not defined in JSON, which can only talk about simple things like strings and lists.
  17. To create a new item in the collection, the client first uses the template object to compose a valid item representation and then uses HTTP POST to send that representation to the server for processing.
  18. Collection+JSON works along the same lines as HTML. The server provides you with some kind of form (the template), which you fill out to create a document. Then you send that document to the server with a POST request.
  19. Its template property is the "template object" mentioned in the Collection+JSON specification: { ... "template": { "data": [ {"prompt": "Text of message", "name": "text", "value":""} ] } To fill out the template, I replace the empty string under value with the string I want to publish: { "template": { "data": [ {"prompt": "Text of the message", "name": "text", "value": "Squid!"} ] } } I then send the filled-out template as part of an HTTP POST request: POST /api/ HTTP/1.1 Host: www.youtypeitwepostit.com Content-Type: application/vnd.collection+json { "template": { "data": [ {"prompt": "Text of the message", "name": "text", "value": "Squid!"} ] } }
  20. The server responds: HTTP/1.1 201 Created Location: http://www.youtypeitwepostit.com/api/47210977342911065 The 201 response code (Created) is a little more specific than 200 (OK); it means that everything is OK and that a new resource was created in response to my request. The Location header gives the URL to the newborn resource. (See the sketch after this list for this create flow in code.)
  21. REST is not a protocol, a file format, or a development framework. It's a set of design constraints: statelessness, hypermedia as the engine of application state, and so on. Collectively, we call these the Fielding constraints, because they were first identified in Roy T. Fielding's 2000 dissertation on software architecture, which gathered them together under the name “REST.”
  22. A resource is anything that's important enough to be referenced as a thing in itself. If your users might “want to create a hypertext link to it, make or refute assertions about it, retrieve or cache a representation of it, include all or part of it by reference into another representation, annotate it, or perform other operations on it” (Architecture), you should make it a resource. Giving something a URL turns it into a resource.
  23. When a client issues a GET request for a resource, the server should serve a document that captures the resource in a useful way. That's a representation — a machine-readable explanation of the current state of a resource.
  24. The server might describe a database row as an XML document, a JSON object, a set of comma-separated values, or as the SQL INSERT statement used to create it. These are all legitimate representations; it depends on what the client asks for. A representation can be any machine-readable document containing any information about a resource. We think of representations as something the server sends to the client. That's because when we surf the Web, most of our requests are GET requests. We're asking for representations. But in a POST, PUT, or PATCH request, the client sends a representation to the server. The server's job is then to change the resource state so it reflects the incoming representation.
  25. The server sends a representation describing the state of a resource. The client sends a representation describing the state it would like the resource to have. That's representational state transfer.
  26. If a DELETE request succeeds, the possible status codes are 204 (No Content, i.e., “it's deleted, and I don't have anything more to say about it”), 200 (OK, i.e., “it's deleted, and here's a message about that”); and 202 (Accepted, i.e., “I'll delete it later”).
  27. If a client tries to GET a resource that has been DELETEd, the server will return an error response code, usually 404 (Not Found) or 410 (Gone).
  28. The DELETE method has another useful property: it's idempotent. Once you delete a resource, it's gone. The resource state has permanently changed. You can send another DELETE request, and you might get a 404 error, but the resource state is exactly as it was after the first request. The resource is still gone.
  29. A POST request to a resource creates a new resource underneath it. The most common response code to a POST-to-append request is 201 (Created). It lets the client know that a new resource was created. The Location header lets the client know the URL to this new resource. Another common response code is 202 (Accepted), which means that the server intends to create a new resource based on the given representation, but hasn't actually created it yet.
  30. A PUT request is a request to modify resource state. The client takes the representation it got from a GET request, modifies it, and sends it back as the payload of a PUT request. If the server decides to accept a PUT request, the server changes the resource state to match what the client says in the representation, and usually sends either 200 (OK) or 204 (No Content). PUT is idempotent, just like DELETE. The client can also use PUT to create a new resource, if it knows the URL where the new resource should live. PUT is an idempotent operation even when you use it to create a new resource.
  31. Sending a full representation via PUT can be wasteful when only a small change is needed; the PATCH method allows for this. Instead of PUTting a full representation, you can create a special “diff” representation and send it to the server as the payload of a PATCH request.
  32. PATCH /my/data HTTP/1.1 Host: example.org Content-Length: 326 Content-Type: application/json-patch+json If-Match: "abc123" [ { "op": "test", "path": "/a/b/c", "value": "foo" }, { "op": "remove", "path": "/a/b/c" }, { "op": "add", "path": "/a/b/c", "value": [ "foo", "bar" ] }, { "op": "replace", "path": "/a/b/c", "value": 42 }, { "op": "move", "from": "/a/b/c", "path": "/a/b/d" }, { "op": "copy", "from": "/a/b/d", "path": "/a/b/e" } ]
  33. The best response codes for a successful PATCH are the same as for PUT and DELETE: 200 (OK) if the server wants to send data (such as an updated representation of the resource) along with its response, and 204 (No Content) if the server just wants to indicate success. PATCH is neither safe nor idempotent.
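
To make the template-fill-and-POST flow from items 19 and 20 concrete, here is a minimal sketch using Python's requests library. It assumes the book's companion service at www.youtypeitwepostit.com is still reachable; the payload and URL come from the notes above.

import requests

# Fill out the Collection+JSON template with the message to publish
template = {
    "template": {
        "data": [
            {"prompt": "Text of the message", "name": "text", "value": "Squid!"}
        ]
    }
}

# POST the filled-out template to the collection URL
resp = requests.post(
    "http://www.youtypeitwepostit.com/api/",
    json=template,
    headers={"Content-Type": "application/vnd.collection+json"},
)

print(resp.status_code)          # expect 201 (Created)
print(resp.headers["Location"])  # URL of the newly created resource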

Monday, October 28, 2013

XPath Cheatsheet

<root xmlns:foo="http://www.foo.org/" xmlns:bar="http://www.bar.org">
 <actors>
  <actor id="1">Christian Bale</actor>
  <actor id="2">Liam Neeson</actor>
  <actor id="3">Michael Caine</actor>
 </actors>
 <foo:singers>
  <foo:singer id="4">Tom Waits</foo:singer>
  <foo:singer id="5">B.B. King</foo:singer>
  <foo:singer id="6">Ray Charles</foo:singer>
 </foo:singers>
</root>

Excerpted from: http://www.freeformatter.com/xpath-tester.html


1. Select the document node

/

2. Select the 'root' element

/root

3. Select all 'actor' elements that are direct children of the 'actors' element.

/root/actors/actor

4. Select all 'singer' elements regardless of their positions in the document.

//foo:singer

5. Select the 'id' attributes of the 'singer' elements regardless of their positions in the document.

//foo:singer/@id

6. Select the textual value of first 'actor' element.

//actor[1]/text()

7. Select the last 'actor' element.

//actor[last()]

8. Select the first and second 'actor' elements using their position.

//actor[position() < 3]

9. Select all 'actor' elements that have an 'id' attribute.

//actor[@id]

10. Select the 'actor' element with the 'id' attribute value of '3'.

//actor[@id='3']

11. Select all 'actor' nodes with the 'id' attribute value lower or equal to '3'.

//actor[@id<=3]

12. Select all the children of the 'singers' node.

/root/foo:singers/*

13. Select all the elements in the document.

//*

14. Select all the 'actor' elements AND the 'singer' elements.

//actor|//foo:singer

15. Select the name of the first element in the document.

name(//*[1])

16. Select the numeric value of the 'id' attribute of the first 'actor' element.

number(//actor[1]/@id)

17. Select the string representation value of the 'id' attribute of the first 'actor' element.

string(//actor[1]/@id)

18. Select the length of the first 'actor' element's textual value.

string-length(//actor[1]/text())

19. Select the local name of the first 'singer' element, i.e. without the namespace.

local-name(//foo:singer[1])

20. Select the number of 'singer' elements.

count(//foo:singer)

21. Select the sum of the 'id' attributes of the 'singer' elements.

sum(//foo:singer/@id)
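
These expressions are easy to exercise programmatically. Below is a minimal sketch using Python's lxml library; it assumes the sample document above is saved as sample.xml (a hypothetical filename). Note that the foo namespace prefix must be bound explicitly when evaluating namespaced expressions:

from lxml import etree

# Parse the sample document shown at the top of this post
doc = etree.parse("sample.xml")
ns = {"foo": "http://www.foo.org/"}

print(doc.xpath("//actor[last()]/text()"))                # ['Michael Caine']
print(doc.xpath("//foo:singer/@id", namespaces=ns))       # ['4', '5', '6']
print(doc.xpath("count(//foo:singer)", namespaces=ns))    # 3.0
print(doc.xpath("sum(//foo:singer/@id)", namespaces=ns))  # 15.0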

Thursday, October 24, 2013

OpenID Connect - OAuth 2.0 with Authentication

Connect | OpenID:
OpenID Connect 1.0 is a simple identity layer on top of the OAuth 2.0 protocol. It allows Clients to verify the identity of the End-User based on the authentication performed by an Authorization Server, as well as to obtain basic profile information about the End-User in an interoperable and REST-like manner.

Single Sign-on with Shibboleth IDP Server

What is Single Sign-on?
Single sign-on (SSO) is a property of access control of multiple related, but independent software systems. With this property a user logs in once and gains access to all systems without being prompted to log in again at each of them. Conversely, Single sign-off is the property whereby a single action of signing out terminates access to multiple software systems.

Single sign-on requires that users literally sign in once to establish their credentials. Systems which require the user to log in multiple times to the same identity are inherently not single sign-on. For example, an environment where users are prompted to log into their desktop, then log into their email using the same credentials, is not single sign-on.

SSO shares centralized authentication servers that all other applications and systems use for authentication purposes and combines this with techniques to ensure that users do not have to actively enter their credentials more than once.

Common SSO Configurations:

  1. Kerberos based - Initial sign-on prompts the user for credentials and gets a Kerberos ticket-granting ticket (TGT). Additional software applications requiring authentication, such as email clients or intranet portals, use the TGT to acquire service tickets, proving the user's identity to the mail server or portal server without requiring the user to re-enter credentials.
  2. Smart card based - Initial sign-on requires use of smart card. Smart card-based single sign-on can either use certificates or passwords stored on the smart card.
  3. OTP based - (one time password) - Two-factor authentication with OTP tokens
  4. SAML based - Security Assertion Markup Language (SAML) is an XML-based solution for exchanging user security information between an enterprise and a service provider. It supports W3C XML encryption and service-provider-initiated web single sign-on exchanges. The user is called the subject in SAML-based single sign-on. The identity provider is the one that provides the user credentials. The service provider trusts the identity provider's user information to provide access to its services or resources. One such SAML-based SSO solution is Shibboleth, which we discuss below.

Shibboleth Identity Provider
Shibboleth is one of the world's most widely deployed federated identity solutions. The Shibboleth software implements widely used federated identity standards, principally the OASIS Security Assertion Markup Language (SAML), to provide a federated single sign-on and attribute exchange framework. A user authenticates with his or her organizational credentials, and the organization (or identity provider) passes the minimal identity information necessary to the service provider to enable an authorization decision. The IdP provides only authentication, not authorization.
The Identity Provider provides Single Sign-On services and extends reach into other organizations and new services through authentication of users and securely providing appropriate data to requesting services. In addition to a simple yes/no response to an authentication request, the Identity Provider can provide a rich set of user-related data to the Service Provider. This data can help the service provide a more personalized user experience, save the user from having to manually enter data the service requires, and refresh the data each time the user logs onto the service.



The normal Identity Provider process is:

  • Accept a SAML authentication request from the Service Provider a user wants to access;
  • Authenticate the user against your organization's existing authentication service;
  • Collect user data from your organization's existing data stores;
  • Apply policy to control what data is released to which Service Provider;
  • Securely transmit the collected information to the Service Provider.

Single Sign-on Steps

Step 1: User accesses the Resource

The user starts by attempting to access the protected resource. The resource monitor determines if the user has an active session and, discovering that they do not, directs them to the service provider in order to start the SSO process.

Step 2: Service Provider issues Authentication Request

The user arrives at the Service Provider which prepares an authentication request and sends it and the user to the Identity Provider. The Service Provider software is generally installed on the same server as the resource.

Step 3: User Authenticated at Identity Provider

When the user arrives at the Identity Provider it checks to see if the user has an existing session. If they do, they proceed to the next step. If not, the Identity Provider authenticates them (e.g. by prompting for, and checking, a username and password) and the user proceeds to the next step.

Step 4: Identity Provider issues Authentication Response

After identifying the user, the Identity Provider prepares an authentication response and sends it and the user back to the Service Provider.

Step 5: Service Provider checks Authentication Response

When the user arrives with the response from the Identity Provider, the Service Provider will validate the response, create a session for the user, and make some information retrieved from the response (e.g. the user's identifier) available to the protected resource. After this, the user is sent to the resource.

Step 6: Resource returns Content

As in Step 1, the user is now trying again to access the protected resource, but this time the user has a session and the resource knows who they are. With this information the resource will service the user's request and send back the requested data.

Federated Single Sign-on

If you have heard about Shibboleth you have probably also heard something about "federations" or "Federated Single Sign-on". The steps above are common to all SSO systems, but some of these systems are designed to work only when the Identity Provider and Service Provider are in the same organization, whilst others are designed to work regardless of whether the two components are in the same organization. Implementations that fall into the latter category are said to implement Federated Single Sign-on.
It is not uncommon that a given Service Provider may wish to work with more than one Identity Provider (e.g. commercial services with multiple customers, resources used by researchers at multiple organizations), and likewise a given Identity Provider might wish to work with multiple Service Providers. When a group of Identity and Service Providers agree to work together, this group is called a federation.

There have been security concerns pertaining to federated SSO (which is gaining a lot of popularity on websites these days); one example, excerpted from Wikipedia, is reproduced below:
The problem we see now is that many websites are adopting Facebook’s “Connect” and OpenID to allow for one-click logins to access a website. You sometimes don’t even have the choice of making a separate account on that site, meaning you can’t “opt out” of these SSOs. Sure, your information stays safe with that site, but it’s also stored within a central database under Facebook’s control. While there’s nothing wrong with this, there’s just too much risk involved in putting all your sensitive data from all over the web into one massive identity bubble.

Deploying Shibboleth Identity Provider

  1. Install tomcat (or one of the supported web containers) and deploy Shibboleth WAR.
  2. Configure IDP login handler for one of the supported authentication types. Simplest will be to have user authenticated based on their IP address as described here - https://wiki.shibboleth.net/confluence/display/SHIB2/IdPAuthIP 
  3. Configure IDP Server - https://wiki.shibboleth.net/confluence/display/SHIB2/IdPConfiguration 
    1. Service provider (application requiring SSO) identified as relying party 
    2. Generate self-signed certificate for IDP server and import to IDP server’s trust store. Used to enable SSL for tomcat.
    3. Export Service Provider's server certificate and import it to IDP server.
    4. Export IDP server certificate and import it to Service provider server.
    5. IDP server metadata URI (which identifies the URI for the service provider where authentication assertion needs to be sent). For example,
      1. IDP metadata URI - https://:8443/idp.xml
      2. Service provider assertion consumer URI - https:///assertionconsumer

One alternative to Shibboleth (SAML based) is to use an OpenID based solution like OpenID4Java, which I will capture in a following post.

Tuesday, October 22, 2013

Python

Python is an interpreted, object-oriented language.

Variables are simply defined as:
my_int=7
my_float=1.23
my_bool=True
my_string = "Always look on the bright side of life!"

Never use quotation marks (' or ") with booleans, and always capitalize the first letter! Python is case-sensitive (it cares about capitalization). 

In Python, statements are separated by newlines (no ; at the end!).

Code needs to be indented by 4 spaces (or a tab).
Ex:
def spam():
    eggs = 12
    return eggs

Comments begin with # or use triple quotation marks for a multi-line comment.
Ex:
""" Hi there 
this is my comment

Long one indeed!"""

Arithmetic operators: +, -, *, /, % and ** (exponentiation). The ** operator raises the first number, the base, to the power of the second number, the exponent.
Ex:
my_var = 10**2
Ans: 100

A backslash (\) escapes quotes inside a string.
Ex:
'Help! Help! I\'m being repressed!'

So if you wanted "Y", you could just type "PYTHON"[1].

String methods: string methods are pre-built pieces of code that perform specific tasks for strings. lower() and upper() are string methods; len() and str() are built-in functions that also work on strings.
Ex:
print "The value of pi is around " + str(3.14)
+ works as the string concatenation operator, but first we need to convert 3.14 (a float) to a string using str().

print "The %s who %s %s!" % ("Knights", "say", "Ni")

Getting user input on Console:
name = raw_input("What is your name?")


Using datetime:
from datetime import datetime

now = datetime.now()
print str(now.month) + "/" + str(now.day) + "/" + str(now.year)



  Spring Start Here: Learn what you need and learn it well by Laurentiu Spilca My rating: 5 of 5 stars This is an excellent book on gett...