Wednesday, March 14, 2012

Cassandra PHPCassa & Composite Types

This post has been updated to support phpcassa 1.0.a.1

Cassandra Composite Type using PHPCassa


phpcassa 1.0.a.1 uses PHP namespaces, which are supported in PHP 5.3.0 and later.
Make sure you have the relevant package.
The script below is a copy of the PHPCassa Composite Example.

I will explain it step by step.

(1) Creating Keyspace using PHPCassa
        Name => "Keyspace1"
        Replication Factor => 1
        Placement Strategy => Simple Strategy
(2) Creating Column Family with Composite Keys using PHPCassa
        Name => "Composites"
        Column Comparator => CompositeType of LongType, AsciiType (Ex: 1:example)
        Row Key Validation => CompositeType of AsciiType, LongType (Ex: example:1)
        Sample Row:
                'example':1 => { 1:'columnName' => "value", 1:'d' => "Hai", 2:'b' => "Fine", 112:'a' => "Sorry" }
        Columns are sorted based on the component types, as shown above.
        112 > 2 as LongType, but "112" < "2" as Ascii.
        Cassandra properly honors the types given in the column family definition.
        I have used '' to denote ascii values; the quotes are not part of the values.
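The sort order described above can be tried out without a cluster. The following is only a JavaScript sketch of the comparison rules; it does not touch Cassandra or phpcassa, and compareNames and the sample names are made up for illustration.

```javascript
// LongType-style ordering: compare as numbers.
var asLongs = [112, 2, 1].sort(function (x, y) { return x - y; });
console.log(asLongs); // [ 1, 2, 112 ]

// Ascii-style ordering: compare character by character.
var asAscii = ['112', '2', '1'].sort();
console.log(asAscii); // [ '1', '112', '2' ]

// Hypothetical comparator for names shaped like CompositeType(LongType, AsciiType):
// the first component decides, the second one only breaks ties.
function compareNames(x, y) {
    if (x[0] !== y[0]) { return x[0] - y[0]; }
    if (x[1] < y[1]) { return -1; }
    if (x[1] > y[1]) { return 1; }
    return 0;
}

var names = [[112, 'a'], [2, 'b'], [1, 'd'], [1, 'columnName']];
names.sort(compareNames);
console.log(JSON.stringify(names)); // [[1,"columnName"],[1,"d"],[2,"b"],[112,"a"]]
```

The last line gives the same order as the sample row above.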
require_once(__DIR__.'/../lib/autoload.php');

use phpcassa\Connection\ConnectionPool;
use phpcassa\ColumnFamily;
use phpcassa\ColumnSlice;
use phpcassa\SystemManager;
use phpcassa\Schema\StrategyClass;

// Create a new keyspace and column family
$sys = new SystemManager('127.0.0.1');
$sys->create_keyspace('Keyspace1', array( // (1)
    "strategy_class" => StrategyClass::SIMPLE_STRATEGY,
    "strategy_options" => array('replication_factor' => '1')
));

// Use composites for column names and row keys
$sys->create_column_family('Keyspace1', 'Composites', array( //(2)
    "comparator_type" => "CompositeType(LongType, AsciiType)",
    "key_validation_class" => "CompositeType(AsciiType, LongType)"
));


Start a connection pool and create an instance of the Composites ColumnFamily
$pool = new ConnectionPool('Keyspace1', array('127.0.0.1'));
$cf = new ColumnFamily($pool, 'Composites');
Specifying Row Keys and Column Keys
Both our row key [key_validation_class] and column key [comparator] are composite types.
That means each key is made of components, and the type of each component may differ.
So we can't specify a key as a single entity; that could violate the data types the Cassandra cluster expects.
For example, in our row keys component 1 is Ascii and component 2 is Long.
When a write or read request is sent to Cassandra, the type of each component must be preserved.
Specifying "key:1" won't work and results in a Cassandra exception.

Hence we keep the components of a key in a PHP array and set insert_format and return_format to ARRAY_FORMAT.
Ex: $key1 = array("key", 1); //Ascii, Long
Other available formats for insert and return are
  • DICTIONARY // Here, array keys correspond to row keys. So, we can't use this as our keys have components
  • OBJECT // This is almost the same as what Thrift returns
For columns, each column name maps to a value. Hence each column is array ( array ( components ) , value )
The array inside an array is required because PHP array keys can only be integers or strings, so an array of components can't be used as a key.
As we need to preserve the component types, we can't specify "columnKey"=>value anymore.
Hence we map each column to array(key, value), where key itself is an array(components)
// Make it easier to work with non-scalar types
$cf->insert_format = ColumnFamily::ARRAY_FORMAT;
$cf->return_format = ColumnFamily::ARRAY_FORMAT;

// Composite Row Keys ()
$key1 = array("key", 1);
$key2 = array("key", 2);

$columns = array(
    array(array(0, "a"), "val0a"),

    array(array(1, "a"), "val1a"),
    array(array(1, "b"), "val1b"),
    array(array(1, "c"), "val1c"),

    array(array(2, "a"), "val2a"),

    array(array(3, "a"), "val3a")
);

$cf->insert($key1, $columns);
$cf->insert($key2, $columns);

Then we fetch data
(1) Get all the columns corresponding to a key
(2) insert and return format is array so accessing via index
(3) Should output an array of components of column name
//Constructor of Column Slice
__construct( mixed $start = "", mixed $finish = "", integer $count = phpcassa\ColumnSlice::DEFAULT_COLUMN_COUNT, boolean $reversed = False ) 

(4) ColumnSlice => ColumnSlice(array(1), array(1))
  1. $start => array, means composite type
    Ex: array(component, array(component, INCLUSIVE_FLAG), ...) // inner array is component specific and required only if you wish to override INCLUSIVE_FLAG
  2. $finish => Same as $start
So, we ask for all columns whose first component [note the array, because of the composite type] runs from 1 to 1.
That indirectly means all columns whose first component is 1
(5) $start => "" means the beginning of the row, and
array(1, array("c", false)) means everything less than 1:c, as per the sorting I mentioned in the beginning
(6) Selects all columns whose first component lies between 0 and 2, exclusive of both
(7) Same as (6), but in reverse (notice $reversed set to true, and that $start and $finish are swapped)
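To make the inclusive/exclusive rules in (4) to (7) easier to picture, here is a small in-memory model written in JavaScript. It is only a sketch of the semantics described above, not how phpcassa or Cassandra implement slices; modelSlice, parseBound and comparePrefix are hypothetical names, and the model assumes one inclusive flag per bound.

```javascript
// The six columns inserted in the post, already in comparator order.
var columns = [
    [[0, 'a'], 'val0a'],
    [[1, 'a'], 'val1a'], [[1, 'b'], 'val1b'], [[1, 'c'], 'val1c'],
    [[2, 'a'], 'val2a'],
    [[3, 'a'], 'val3a']
];

// A bound component is either a raw value or [value, inclusiveFlag].
// The flag of the last flagged component applies to the whole bound.
function parseBound(bound) {
    var inclusive = true;
    var values = bound.map(function (component) {
        if (Array.isArray(component)) {
            inclusive = component[1];
            return component[0];
        }
        return component;
    });
    return { values: values, inclusive: inclusive };
}

// Compare only the leading components that the bound specifies.
function comparePrefix(name, values) {
    for (var i = 0; i < values.length; i++) {
        if (name[i] < values[i]) { return -1; }
        if (name[i] > values[i]) { return 1; }
    }
    return 0;
}

// '' as a bound means "no limit on this side".
function modelSlice(cols, start, finish, reversed) {
    var lower = reversed ? finish : start;   // a reversed slice starts at the top
    var upper = reversed ? start : finish;
    var values = cols.filter(function (col) {
        var cmp, bound;
        if (lower !== '') {
            bound = parseBound(lower);
            cmp = comparePrefix(col[0], bound.values);
            if (cmp < 0 || (cmp === 0 && !bound.inclusive)) { return false; }
        }
        if (upper !== '') {
            bound = parseBound(upper);
            cmp = comparePrefix(col[0], bound.values);
            if (cmp > 0 || (cmp === 0 && !bound.inclusive)) { return false; }
        }
        return true;
    }).map(function (col) { return col[1]; });
    return reversed ? values.reverse() : values;
}

console.log(modelSlice(columns, [1], [1], false));                  // (4) val1a, val1b, val1c
console.log(modelSlice(columns, '', [1, ['c', false]], false));     // (5) val0a, val1a, val1b
console.log(modelSlice(columns, [[0, false]], [[2, false]], false)); // (6) val1a, val1b, val1c
console.log(modelSlice(columns, [[2, false]], [[0, false]], true));  // (7) val1c, val1b, val1a
```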
// Fetch a user record
$row = $cf->get($key1); //(1)
$col1 = $row[0];
list($name, $value) = $col1; //(2)
echo "Column name: ";
print_r($name); //(3)
echo "Column value: ";
print_r($value);
echo "\n\n";

// Fetch columns with a first component of 1
$slice = new ColumnSlice(array(1), array(1)); // (4)
$columns = $cf->get($key1, $slice);
foreach($columns as $column) {
    list($name, $value) = $column;
    var_dump($name); 
    echo "$value, ";
}
echo "\n\n";

// Fetch everything before (1, c), exclusive
$inclusive = False;
$slice = new ColumnSlice('', array(1, array("c", $inclusive))); // (5)
$columns = $cf->get($key1, $slice);
foreach($columns as $column) {
    list($name, $value) = $column;
    echo "$value, ";
}
echo "\n\n";

// Fetch everything between 0 and 2, exclusive on both ends
$slice = new ColumnSlice( // (6)
    $start = array(array(0, False)),
    $end   = array(array(2, False))
);
$columns = $cf->get($key1, $slice);
foreach($columns as $column) {
    list($name, $value) = $column;
    echo "$value, ";
}
echo "\n\n";

// Do the same thing in reverse
$slice = new ColumnSlice(    //(7)
    $start = array(array(2, False)),
    $end   = array(array(0, False)),
    $count = 10,
    $reversed = True
);
$columns = $cf->get($key1, $slice);
foreach($columns as $column) {
    list($name, $value) = $column;
    echo "$value, ";
}
echo "\n\n";

// Clear out the column family
$cf->truncate();

// Destroy our schema
$sys->drop_keyspace("Keyspace1");

// Close our connections
$pool->close();
$sys->close();
This version of PHPCassa is an awesome revamp of its earlier version.
  • This has come out with Thrift 0.8 Support
  • Composite Type Support [no more serialize or unserialize required ;)]
  • Full Support for Batch Mutate
  • Implementation using namespaces
  • All new API Reference
  • And Complete Examples
Awesome work by Tyler Hobbs :)
Hope this helps :)

Javascript ASI and join vs concat(+)

Javascript Automatic Semicolon Insertion


I came across a nice implication of Automatic Semicolon Insertion while developing an API in javascript.

I'll let you guess at first as usual. Try the following
function asi() {
    var a = 10,
    b = 20
    c = 30;
    this.log = function () {
        console.log(a,b,c);
    };
    this.set = function (A,B,C) {
        a=A;
        b=B;
        c=C;        
    }
}

var a = new asi();
a.log();
var b = new asi();
b.log();
a.set(11,21,31);
b.log();
b.set('This', 'is', 'wrong');
a.log();

//Expected output
10 20 30
10 20 30
10 20 30
11 21 31

//What happened??
10 20 30
10 20 30
10 20 31
11 21 wrong

How Come?
First thing to note:
Look closely at line 3 (b = 20): the comma after it is missing. So now the parser decides what to do :P
Remember:
When a statement is missing its semicolon and the line that follows still makes sense as a continuation of it,
the JS engine does not insert a semicolon. Instead it parses them as a single statement. [Refer My Prev Post ]
Implication:
var a=10,b=20 could still continue on the next line, so no semicolon is inserted yet
var a=10,b=20 c=30; is not a valid javascript statement. So ASI makes it var a=10,b=20;c=30; [converse of the above]
Finally:
The variable c is assigned without ever being declared in the scope of function asi
Remember:
If a variable is assigned without being declared in the function scope &&
If the variable is not declared anywhere in the scope chain of the function
Then (in non-strict mode) it becomes a property of the global object, i.e. window in a browser
Implication:
Hence, variable 'c' is created in the global scope rather than in function asi(), and every instance shares it
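One way to catch this class of mistake early is strict mode, where assigning to an undeclared variable throws instead of creating a global. The sketch below shows that, followed by the version with the comma restored. strictAsi and FixedAsi are names made up for this illustration, and it assumes no global named c already exists.

```javascript
function strictAsi() {
    'use strict';
    var a = 10,
        b = 20      // comma missing here, as in the post
    c = 30;         // separate statement: assignment to an undeclared variable
    return [a, b, c];
}

var caught = null;
try {
    strictAsi();
} catch (e) {
    caught = e;
}
console.log(caught instanceof ReferenceError); // true

// With the comma restored, c is a local variable and each instance gets its own copy.
function FixedAsi() {
    var a = 10,
        b = 20,
        c = 30;
    this.get = function () { return [a, b, c]; };
    this.set = function (A, B, C) { a = A; b = B; c = C; };
}

var first = new FixedAsi();
var second = new FixedAsi();
first.set(11, 21, 31);
console.log(first.get());  // [ 11, 21, 31 ]
console.log(second.get()); // [ 10, 20, 30 ]
```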

That is all there is to say about it :)
Better not to skimp on semicolons :P
Use them wherever they are needed,
so that you can trace errors easily when you make an unintentional mistake like the one above.

Next is about Join Vs Concat(+)


I was going through many of the tests on this topic at jsperf.
I inferred that in all modern browsers (+) for concatenation is optimized really well [in some cases (+) was 100 times faster than join].

But I think the better criterion for choosing between them is the use case,
because both of them can do the job in less than a ms.

The following example shows why I feel join is safer than (+)
function whyJoin() {
    var a, b, c, delim = '&';
    return {
        setter: function (A, B, C) {
            a = A;
            b = B;
            c = C;            
        },
        concatMe: function () {
            return a+delim+b+delim+c;
        },
        joinMe: function () {
            return [a, b, c].join(delim);
        }        
    };
}

var test = whyJoin();
test.setter('This', 'is', 'Good');
var c = test.concatMe();
var j = test.joinMe();
console.log(c.split('&')); 
console.log(j.split('&'));
test.setter('This', 'is');
c = test.concatMe();
console.log("Doesnt look good", c);
j = test.joinMe();
console.log("Seems Fine", j); 
console.log(c.split('&')); 
console.log(j.split('&'));
test.setter('This', 'is', null);
c = test.concatMe();
console.log("Doesnt look good", c);
j = test.joinMe();
console.log("Seems Fine", j); 
console.log(c.split('&')); 
console.log(j.split('&'));
If you check your console, the following will be the output
["This", "is", "Good"]
["This", "is", "Good"]
Doesnt look good This&is&undefined
Seems Fine This&is&
["This", "is", "undefined"]
["This", "is", ""]
Doesnt look good This&is&null
Seems Fine This&is&
["This", "is", "null"]
["This", "is", ""]

As you may have noticed, with concat(+) undefined or null is converted to its string equivalent and appended, which may be undesired in some cases.
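The same behaviour in its smallest form, plus what it costs to get the join result with (+). This is only a sketch; orEmpty is a hypothetical helper, not something from the example above.

```javascript
var missing;                          // undefined
var parts = ['This', 'is', missing];

var viaPlus = parts[0] + '&' + parts[1] + '&' + parts[2];
var viaJoin = parts.join('&');

console.log(viaPlus); // This&is&undefined
console.log(viaJoin); // This&is&

// With (+), every operand has to be guarded by hand to get the same result.
function orEmpty(value) {
    return value == null ? '' : value;   // == null matches both null and undefined
}
var guarded = orEmpty(parts[0]) + '&' + orEmpty(parts[1]) + '&' + orEmpty(parts[2]);
console.log(guarded === viaJoin); // true
```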

Also, join keeps things clear & clean.
For ex: when the delimiter is the same across all concatenations, or the operands already exist as an array.

Concat(+) is really useful in many cases.
For ex: when the concatenation is not based on a delimiter & the number of concatenation operations is small
var result = '<li>' + param + '</li>';
    In some cases I feel using both keeps things clear.
    For ex:
    for(some condns) {
        result += [param, param, param].join('&');
    }
    

    But Google's optimization guidelines suggest creating a string builder for the above case.
    function stringBuilder() {
        this.needls = [];
    }
    stringBuilder.prototype.push = function (needle) {
        this.needls.push(needle);
    };
    stringBuilder.prototype.build = function () {
        var result = this.needls.join('');
        this.needls = [];
        return result;
    };
    var strBuilderInstance = new stringBuilder();
    for(some cdns) {
         strBuilderInstance.push([param, param, param].join('&'));
    }
    var result = strBuilderInstance.build();
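The loop above is pseudocode, so here is one way it could look when run end to end. This is a sketch under assumptions: the rows array stands in for whatever the loop would really iterate over, and the builder is the same idea as above with the names tidied up (StringBuilder, pieces).

```javascript
function StringBuilder() {
    this.pieces = [];
}
StringBuilder.prototype.push = function (piece) {
    this.pieces.push(piece);
};
StringBuilder.prototype.build = function () {
    var result = this.pieces.join('');
    this.pieces = [];               // reset so the builder can be reused
    return result;
};

// Hypothetical input: each row holds the three params for one iteration.
var rows = [['a', 'b', 'c'], ['d', 'e', 'f']];

var builder = new StringBuilder();
for (var i = 0; i < rows.length; i++) {
    builder.push(rows[i].join('&'));    // join inside, builder outside
}
var built = builder.build();

console.log(built);                    // a&b&cd&e&f
console.log(builder.build() === '');   // true, the builder was reset
```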
    

    All the tests performed on jsperf are performance tests :)
    Better to decide based on your use case, because javascript is fast enough but the DOM is taking all the time :) -> Douglas Crockford

    Tuesday, March 13, 2012

    Is Javascript Pass By Reference or Pass By Value?

    Javascript - Pass By Reference or Value?
    Javascript Types:
    string, number, boolean, null, undefined are javascript primitive types.
    functions, objects, arrays are javascript reference types.

    Difference?
    One difference is how they are passed: by value or by reference.

    I'm considering string as the primitive type and object as the reference type for the explanation.

    Try guessing the alerts in the following examples yourself before reading the answers

    //Example 1
    function test(student) {
        student = 'XYZ';
        alert(student);
    }
    
    var a = 'ABC';
    test(a);
    alert(a); 
    
    //Example 2
    function test(student) {
        alert(student.name);
        student.marks = 10;
        student.name = 'XYZ';
    }
    
    var a = {name:'ABC'};
    test(a);
    alert(a.name);
    
    //Example 3
    function test() {
        var student;
        return {
         setter: function (a) {
             student = a;   
         },
         getter: function () {
             return student;
         },
         change: function () {
             student.name = 'XYZ';
         }
       }
    }
    
    
    var a = {name:'ABC'};
    var b = test();
    b.setter(a);
    a.name = 'DEF';
    alert(b.getter().name);
    b.change();
    alert(a.name);
    
    //Example 4
    function test() {
        var student;
        return {
         setter: function (a) {
             student = a;   
         },
         getter: function () {
             return student;
         },
         change: function () {
             student.name = 'XYZ';
         }
       }
    }
    
    var a = {name:'ABC'};
    var b = test();
    b.setter(a);
    a = {name:'DEF'};
    alert(b.getter().name);
    b.change();
    alert(a.name);
    

    Try reasoning out why they are so, if you got them wrong, before reading the explanation

    //Example 1
    alert 1: XYZ
    alert 2: ABC
    
    //Example 2
    alert 1: ABC
    alert 2: XYZ
    
    //Example 3
    alert 1: DEF
    alert 2: XYZ
    
    //Example 4
    alert 1: ABC
    alert 2: DEF
    

    Reason

    Example 1:
    string is a primitive type & variables hold the values themselves for primitive types.
    Primitives are passed by value in javascript.
    So, a change in 'student' will never affect 'a' and vice versa

    Example 2:
    object is a reference type & variables hold a reference rather than the value for reference types
    Reference types are passed by reference in javascript; more precisely, the function receives a copy of the reference
    Both a & student refer to the same object. Hence a change made through 'student' is reflected in 'a'.
    **Remember: 'student' never points to the variable 'a' itself; it follows a's reference and refers directly to the underlying object.

    Example 3:
    The reason is the same as in Example 2
    I created this example to show that a change made through 'a' is reflected back in 'student' too

    Example 4:
    This example tests your real understanding.
    If you can reason this out now, you can consider yourself good at this topic

    Although 'a' now refers to a new object, that didn't affect student's reference.
    Reason: the point to remember mentioned in Example 2.
    You have to understand this clearly :)
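The same point can be seen from inside a function: changing a property through the parameter is visible to the caller, while pointing the parameter at a new object is not. A minimal sketch; mutate and rebind are names made up for illustration.

```javascript
function mutate(student) {
    student.name = 'XYZ';           // changes the object both variables refer to
}

function rebind(student) {
    student = { name: 'NEW' };      // only the local variable now refers elsewhere
    return student;
}

var original = { name: 'ABC' };

mutate(original);
console.log(original.name);            // XYZ

var replacement = rebind(original);
console.log(original.name);            // XYZ, the caller's object is untouched
console.log(replacement.name);         // NEW
console.log(replacement === original); // false
```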

    Hope I have done it some justice. The same can be shown for the other primitive and reference types :)

    You actually don't need function calls to prove these concepts.

    //Equivalent to Example 1
    var a = 10;
    var b = a;
    a = 20;
    console.log(a, b);
    
    //Equivalent to Example 2 & 3
    var a = {value:1};
    var b = a;
    a.name = 'xyz';
    b.value = 10;
    console.log(a, b, a===b);
    
    //Equivalent to Example 4
    var a = [1,2,3];
    var b = a;
    b[0] = 10;
    console.log(a, b, a===b);
    a = [5,6,7];
    console.log(a, b, a===b);
    

    But I used functions to show the effect on 'passing' rather than 'assignment/copying' :)