Random Thoughts on Technology: September 2010

Tuesday, September 28, 2010

Dynamic Language Runtime

The dynamic language runtime (DLR) is a runtime environment that sits on top of the common language runtime (CLR) and provides a set of services for running dynamic languages on the .NET Framework. It includes libraries and constructs that allow an object's type to be determined at run time, in contrast to statically typed languages such as C#, where types have to be defined at compile time.

Scripting/interpreted languages such as JavaScript and PHP are good examples of dynamic languages.

Other popular examples are Lisp, Smalltalk, Ruby and ColdFusion.

Primary Advantages of DLR

Simplifies Porting Dynamic Languages to the .NET Framework
Enables Dynamic Features in Statically Typed Languages
Enables Sharing of Libraries and Objects
Provides Fast Dynamic Dispatch and Invocation
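
The second point, dynamic features in a statically typed language, is easiest to see with the C# 4 dynamic keyword. The following is only a minimal sketch, assuming C# 4 on .NET 4 with a reference to the Microsoft.CSharp assembly; the class and method names (DynamicDemo, SizeOf, Describe) are made up for illustration.

```csharp
using System;
using System.Dynamic;

public static class DynamicDemo
{
    // The Length member is looked up at run time, so the same method
    // works for a string and for an array.
    public static int SizeOf(dynamic value)
    {
        return value.Length;
    }

    // ExpandoObject lets members be added at run time.
    public static string Describe()
    {
        dynamic person = new ExpandoObject();
        person.Name = "Ada";
        person.Age = 36;
        return person.Name + " is " + person.Age;
    }

    public static void Main()
    {
        Console.WriteLine(SizeOf("hello"));     // prints 5
        Console.WriteLine(SizeOf(new int[3]));  // prints 3
        Console.WriteLine(Describe());          // prints Ada is 36
    }
}
```

Passing an object without a Length member to SizeOf compiles fine but fails at run time with a RuntimeBinderException, which is the trade-off of deferring the type check.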

Architecture

The main services the DLR adds on top of the CLR are the following:

Expression trees: Used to represent language semantics.
Call site caching: Caches information about operations and operand types that have already been executed, so that repeated calls are dispatched faster.
Dynamic object interoperability: Provides a set of classes and interfaces that language implementers can use and extend to interoperate with .NET.
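
To give a feel for expression trees, the sketch below builds the tree for a + b by hand and compiles it into a delegate. It uses the System.Linq.Expressions API that ships with .NET; the class and method names (ExpressionTreeDemo, BuildAdder) are made up, and this is an illustration of the data structure, not of how any particular language on the DLR is implemented.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Builds the tree for (a, b) => a + b node by node,
    // then compiles it into an executable delegate.
    public static Func<int, int, int> BuildAdder()
    {
        ParameterExpression a = Expression.Parameter(typeof(int), "a");
        ParameterExpression b = Expression.Parameter(typeof(int), "b");
        BinaryExpression sum = Expression.Add(a, b);

        Expression<Func<int, int, int>> lambda =
            Expression.Lambda<Func<int, int, int>>(sum, a, b);

        return lambda.Compile();
    }

    public static void Main()
    {
        Func<int, int, int> add = BuildAdder();
        Console.WriteLine(add(2, 3)); // prints 5
    }
}
```

Because the code is held as data until Compile is called, a language implementation can inspect or rewrite the tree before it runs.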

Source: http://msdn.microsoft.com/en-us/library/dd233052.aspx

Thursday, September 23, 2010

Overview of Entity Framework 4.0

"The Entity Framework bridges the gap between how developers commonly manipulate conceptual objects (customers, orders, products; posts, tags, members; wall posts, private messages, friend connections) and the way data is actually stored (records in database tables). The technical term for a tool that provides this abstraction is object relational mapper (ORM)."

This post gives the gist of what Entity Framework 4 (EF) has to offer and how to program against it.

Best place to start: http://msdn.microsoft.com/en-us/data/aa937723.aspx
For beginners, I would recommend reading http://blogs.msdn.com/b/adonet/archive/2010/07/19/absolue-beginners-guide-to-entity-framework.aspx

I have been going through Entity Framework lately with a view to using it in our next project. To my surprise, programming against data (the DB) has become very simple with this framework. It lets the developer focus on understanding the business domain and modelling the business entities rather than worrying about how to store and access data in the database.

The framework can generate the DB directly from the modelled entities. This is called the Model-First approach and is usually recommended. The reverse, generating entities from an existing DB, is also supported. This helps if you already have the DB ready and still want to use the framework.

LINQ queries or lambda expressions are mostly used to perform CRUD operations against the business entities instead of directly against the database.

If you open the .edmx file (the Entity Model file) in an XML editor, you will see the following sections:
    * Storage Model (SSDL) - Defines the entities, entity sets and associations as they exist in the database. All information required to create the database is picked from here.
    * Conceptual Model (CSDL) - Defines the entities, entity sets and associations that are consumed from the business layer. The entities shown in the designer diagram come from here.
    * Mappings (MSL) - Defines the mappings between the storage model and the conceptual model.

An EntitySet is the collection of all instances of an entity and is named with the pluralized form of the entity (for example, Customers for Customer). A few base classes that you need to be aware of are:
    * ObjectContext is the base class for the entity container. It acts as a container for the entities.
    * EntityObject is the base class for the generated entity classes.

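By default, the context generated by the EF 4 designer has lazy loading enabled, so related entities are fetched only when they are first accessed. Queries themselves use deferred execution: a query is sent to the database only when its results are enumerated. You can force explicit execution of a query, for example by calling ToList().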
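
The difference between defining a query and explicitly executing it can be seen without a database, using plain LINQ to Objects. This is only a sketch under that assumption: in-memory LINQ stands in for LINQ to Entities, which likewise runs a query only when it is enumerated. The names (DeferredExecutionDemo, Run) are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    public static int[] Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Defining the query does not run it.
        IEnumerable<int> evens = numbers.Where(n => n % 2 == 0);

        numbers.Add(4);
        int deferredCount = evens.Count();   // runs now and sees 2 and 4

        // ToList() executes immediately and keeps a snapshot.
        List<int> snapshot = numbers.Where(n => n % 2 == 0).ToList();

        numbers.Add(6);
        int rerunCount = evens.Count();      // runs again and sees 2, 4 and 6

        return new[] { deferredCount, snapshot.Count, rerunCount };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints 2, 2, 3
    }
}
```

With EF, each of those enumerations would be a round trip to the database, which is one reason to materialize a result once with ToList() when it is going to be reused.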

A few commands to query and update look like this:
ctx.Contacts.Where(c=>c.SalesOrderHeaders.Any()).ToList()
ctx.Customers.AddObject(customer);
ctx.SalesOrderDetails.DeleteObject(order);
ctx.SaveChanges() - Sends the pending inserts, updates and deletes to the database.

There are 3 ways of programming against the Entity Framework.

LINQ to Entities - Write LINQ queries to perform operations on entities.
Entity SQL - Use SQL-like strings as commands. However, you are writing the commands against the entities and not against the DB.
Query Builder - Use the query builder methods of ObjectQuery provided by the Entity Framework instead of LINQ.

There are times when you need to perform a complex set of operations on a varied set of tables while interacting with entities. This is when you can map entities to stored procedures. However, there are some limitations on using stored procedures with entities (for example, if one of the operations, say Insert, is mapped to a stored procedure, then the other operations also have to be mapped to stored procedures).

Tracing the SQL commands
During debugging you will want to see the DB query that your LINQ operation is converted to. The usual approach is to use SQL Profiler.

Switching between VS 2010 and SQL Server can be time consuming. Instead, you can trace from code using the System.Data.Objects.ObjectQuery.ToTraceString and System.Data.EntityClient.EntityCommand.ToTraceString methods, which let you view these store commands at run time without having to run a trace against the data source.

LINQ TO ENTITIES
// Define an ObjectSet to use with the LINQ query.
ObjectSet<Product> products = context.Products;
// Define a LINQ query that returns a selected product.
var result = from product in products
where product.ProductID == productID
select product;
// Cast the inferred type var to an ObjectQuery
// and then write the store commands for the query.
Console.WriteLine(((ObjectQuery)result).ToTraceString());

ENTITY SQL
// Define the Entity SQL query string.
string queryString =
@"SELECT VALUE product FROM AdventureWorksEntities.Products AS product
WHERE product.ProductID = @productID";
// Define the object query with the query string.
ObjectQuery<Product> productQuery =
    new ObjectQuery<Product>(queryString, context, MergeOption.AppendOnly);
productQuery.Parameters.Add(new ObjectParameter("productID", productID));
// Write the store commands for the query.
Console.WriteLine(productQuery.ToTraceString());

QUERY BUILDER
int productID = 900;
// Define the object query for the specific product.
ObjectQuery<Product> productQuery =
    context.Products.Where("it.ProductID = @productID");
productQuery.Parameters.Add(new ObjectParameter("productID", productID));
// Write the store commands for the query.
Console.WriteLine(productQuery.ToTraceString());

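You can retrieve objects by key using the GetObjectByKey and TryGetObjectByKey methods on ObjectContext. Both return the object with the specified EntityKey, loading it into the object context if it is not already there. GetObjectByKey throws an ObjectNotFoundException when no such object exists, so you must handle it; TryGetObjectByKey returns false instead.

The abstraction EF provides is not free in terms of performance. See: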

http://msdn.microsoft.com/en-us/library/cc853327.aspx
http://blogs.msdn.com/b/adonet/archive/2008/02/11/exploring-the-performance-of-the-ado-net-entity-framework-part-2.aspx

One thing you should do to improve performance is to precompile frequently used LINQ queries (compiled LINQ queries), so that the translation cost is paid only once.
http://msdn.microsoft.com/en-us/library/bb896297.aspx provides more information on this.

By default, a query returns only the data of the entities themselves, without the data of their associated entities. For example, there may be a case where you want to retrieve SalesPersons along with their SalesOrders details.

You can tell EF to retrieve the related entities' data in the same query, so that you do not end up writing a foreach loop that fills in the related entities one by one. That would result in far too many DB calls.

Using query paths you can preload the related entities.

// When n levels of details have to be retrieved.
var firstContact = (from contact in context.Contacts.Include("SalesOrderHeaders.SalesOrderDetails")
                    select contact).FirstOrDefault();

// When several separate navigation paths have to be included.
ObjectQuery<SalesOrderHeader> query =
    context.SalesOrderHeaders.Include("SalesOrderDetails").Include("Address");

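Security
I can't end the article without mentioning security. I just evaluated the classic problem of SQL injection. The old technique of using parameterized queries is still valid in Entity Framework.

// SQL injection possible: input is concatenated into the command text (userInput is a hypothetical variable).
context.ExecuteStoreQuery<Product>("select * from Products where pid = " + userInput);
// Guarded against SQL injection: the {0} placeholder is converted into a parameter, not formatted into the string.
context.ExecuteStoreQuery<Product>("select * from Products where pid = {0}", 1);
// Guarded against SQL injection: explicit named parameter.
context.ExecuteStoreQuery<Product>("select * from Products where pid = @p0", new SqlParameter { ParameterName = "p0", Value = 1 });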

Overall, I am sure this is going to reduce the developer's work. However, it demands a little more discipline to ensure that developers do not simply treat the entities as a set of tables.

That's all I have been able to read and evaluate about Entity Framework so far. Hopefully I will add an advanced version of this article that touches on transactions and concurrency and goes into a little more detail on the coding, with more snippets. Till then, Happy blogging!! :)

Don't forget to start reading at http://msdn.microsoft.com/en-us/data/aa937723.aspx

Saturday, September 11, 2010

Windows Server AppFabric Caching Framework

Windows Server AppFabric Caching, code-named Velocity, is a framework for providing a unified cache. Application caching is nothing new and has been around for many years. It saves milliseconds or more per request, depending on the number of concurrent users fetching the data.

Earlier, the cache used to be part of the web servers. Later, it moved to the application servers. However, the capabilities of the cache then depended on the capacity of the application server. With more and more concurrent users, it became important to have much bigger caching servers. To prevent bottlenecks on these servers, there was a need for a caching framework that could provide all the capabilities of load-balanced web servers.

The following are some of the capabilities that were expected from, and are delivered by, the Velocity framework:

Load Balance of the Cache Servers
Provision to scale dynamically without having to stop the applications
High availability in case some Cache Servers go down
Maintaining consistency in data copies stored across Cache Servers
Provide a mechanism to invalidate the Cache when the actual data store gets changed

To install and configure the framework, follow the link http://msdn.microsoft.com/en-us/library/ff383731.aspx

The cluster configuration can be stored in one of two places.

Network Shared folder - Usually for smaller applications (1-5 Servers)
SQL Server - For larger Enterprises (greater than 5 Servers)

Configuration can be done either programmatically or by using Windows PowerShell. The AppFabric/Velocity cmdlets are installed as part of the framework.

The following are some of the terms used in the Velocity framework:

Cache Server
Cache Host - Windows Service
Cache Client - Web Application Accessing the Cache
Local Cache - Data is stored in memory of the Web Application
Server Cache - Data is serialized and saved in servers other than Web Application Server
Lead Host - Responsible for storing cached data and also for coordinating with the other hosts to maintain the integrity of the cluster.
Cache Cluster - A set of Cache Hosts working together to provide a unified Cache view

There are two ways of partitioning a Cache. You can configure data to go into one of these partitions, both to manage the cache effectively for performance and to keep an invalidation from affecting other cached data.

Named Cache
Named Region

Memory management

There are two ways of invalidating the Cache.

Timeout
Notification (Default polling interval of 300 secs) - The cache client polls for change notifications at this interval.

Periodically, invalidated cache entries get cleaned up for effective memory management. However, there may be cases where the framework chooses to remove cached data because memory is running low. Your application may get exceptions if it is written on the assumption that the data will always be in the cache. Hence you must write your applications to handle such events.
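
One common way to cope with entries disappearing is to treat every cache read as something that can miss and fall back to the real data store. The sketch below shows that pattern only; InMemoryCache is a hypothetical stand-in written for this example and is not the AppFabric client API, and LoadFromStore is a placeholder for a real database call. It assumes the cache reports a miss by returning null.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a cache client; NOT the AppFabric API.
public class InMemoryCache
{
    private readonly Dictionary<string, object> items =
        new Dictionary<string, object>();

    // Returns null on a miss (an assumption of this sketch).
    public object Get(string key)
    {
        object value;
        return items.TryGetValue(key, out value) ? value : null;
    }

    public void Put(string key, object value) { items[key] = value; }

    // Simulates the framework evicting an entry under memory pressure.
    public void Evict(string key) { items.Remove(key); }
}

public static class CacheAsideDemo
{
    public static int LoadCount;

    // Placeholder for the real (slow) data store.
    private static string LoadFromStore(string key)
    {
        LoadCount++;
        return "value-for-" + key;
    }

    // Never assumes the entry is cached: on a miss, reload and re-cache.
    public static string GetOrLoad(InMemoryCache cache, string key)
    {
        var cached = cache.Get(key) as string;
        if (cached != null) return cached;

        string fresh = LoadFromStore(key);
        cache.Put(key, fresh);
        return fresh;
    }
}
```

Calling GetOrLoad twice for the same key hits the store once; after an Evict the next call quietly reloads instead of failing.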

The eviction strategy used by the Velocity framework is LRU (least recently used), i.e. the data that has gone unused the longest gets removed first.

High availability is achieved by making copies of the cached data. The number of copies that need to be maintained can be configured.
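
To make the LRU idea concrete, here is a small in-process LRU cache. It is a sketch of the general technique only and says nothing about how Velocity implements eviction internally; the LruCache class is made up for this example.

```csharp
using System;
using System.Collections.Generic;

public class LruCache<TKey, TValue>
{
    private readonly int capacity;
    // Most recently used entries sit at the front of the list.
    private readonly LinkedList<KeyValuePair<TKey, TValue>> order =
        new LinkedList<KeyValuePair<TKey, TValue>>();
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> map =
        new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>();

    public LruCache(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException("capacity");
        this.capacity = capacity;
    }

    public bool TryGet(TKey key, out TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> node;
        if (!map.TryGetValue(key, out node))
        {
            value = default(TValue);
            return false;
        }
        // A read counts as a use: move the entry to the front.
        order.Remove(node);
        order.AddFirst(node);
        value = node.Value.Value;
        return true;
    }

    public void Put(TKey key, TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> existing;
        if (map.TryGetValue(key, out existing))
        {
            order.Remove(existing);
            map.Remove(key);
        }
        else if (map.Count >= capacity)
        {
            // Evict the entry at the back: the least recently used one.
            LinkedListNode<KeyValuePair<TKey, TValue>> oldest = order.Last;
            order.RemoveLast();
            map.Remove(oldest.Value.Key);
        }
        map[key] = order.AddFirst(new KeyValuePair<TKey, TValue>(key, value));
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "a" was used more recently.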

Security
Security is applied at the transport level. The Velocity framework can be configured to allow only certain user contexts to access the cache servers. You should grant the user context of the web application (its application pool identity) access to the cache servers.

Lastly, ASP.NET 4.0 applications can use the Velocity framework to store session state.

Hopefully, I have touched on some of the concepts of the new caching framework. Can't wait to implement this on my next project.

References:

http://msdn.microsoft.com/en-us/library/ee790954.aspx

http://www.hanselman.com/blog/CategoryView.aspx?category=AppFabric