Saturday, January 1, 2011

All about Claims based Authentication, Power Pivot Setup and Data Refresh

There are already tons of articles to help you set up PowerPivot; however, each setup has its own set of challenges. In this article I am going to highlight the issues I faced, and I am sure it can save someone the week of time that I spent on them.

At the end of the article I will also publish a few references that helped me in my setup.

Claims-based authentication is the only way to create your web application if you want to support Forms authentication, i.e. any authentication provider that is not tied to Active Directory (in which case you could choose Classic mode for your web application).

If you are reading this article thinking that you can set up PowerPivot on your Claims-based web application, you will be sorry to hear that it is NOT supported. I learnt it the hard way.
You can still view the published PowerPivot reports without any trouble, but you will not be able to take advantage of the data refresh (on-demand/scheduled) capabilities of PowerPivot.

However, I was able to get it working on my laptop (DC, SP Server, DB & Analysis Server on a single box), but with a limitation: I needed to be a Domain Admin & Farm Admin, and it worked only through an unattended account tied to a Secure Store Service application ID configured with the domain account.
The same did not work in a multi-box environment, and I could not understand why. I was working with Microsoft Tech Support to resolve this, which I now think is futile.

For the installation problems, you can refer to my previous article, the October 13, 2010 post below: Troubleshoot on Setup of PowerPivot for SharePoint.


Whether you are setting up PowerPivot on a Claims- or Classic-based web application, you will definitely want to take care of the following:
1. Ensure that you have deployed PowerPivotFarm.wsp globally & PowerPivotWebApp.wsp for the web application on which you are enabling PowerPivot capabilities (see the sketch after this list).
See Central Administration > System Settings > Farm Management > Manage Farm Solutions >
2. Ensure that Secure Store Service, SQL Server Analysis Services, SQL Server PowerPivot System Service, Timer Service are started. 
See Central Administration > System Settings > Servers > Manage Services on Server
3. A few features have to be activated in the site collection:
PowerPivot Feature Integration for Site Collections
SharePoint Server Publishing Infrastructure
SharePoint Server Publishing 

See, Site Settings > Site Collection Administration > Site Collection Features >
4. If you have changed the master page, then the PowerPivot Gallery may fail to load. Add the following in the VirtualDirectory\{app port}\web.config (inside the <SafeControls> section):
<SafeControl Src="~/_layouts/powerpivot/*" IncludeSubFolders="True" Safe="True" AllowRemoteDesigner="True" SafeAgainstScript="True" />
5. For data refresh, ensure that the application pool account of the PowerPivot service application has sufficient privileges.
6. You may need to set up Kerberos (or enable delegation in Active Directory) when you have a multi-box setup. I am not sure about this one yet. :)
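
To quickly verify the first item, here is a minimal C# sketch (run on a farm server under a farm administrator account, with Microsoft.SharePoint.dll referenced; checking via Central Administration as described above works just as well):

using System;
using Microsoft.SharePoint.Administration;

class SolutionCheck
{
    static void Main()
    {
        // Look up each PowerPivot solution in the farm solution store
        // and report whether it has been deployed.
        foreach (string name in new[] { "powerpivotfarm.wsp", "powerpivotwebapp.wsp" })
        {
            SPSolution solution = SPFarm.Local.Solutions[name];
            Console.WriteLine("{0}: {1}", name,
                solution == null
                    ? "not in the solution store"
                    : (solution.Deployed ? "deployed" : "not deployed"));
        }
    }
}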

Some settings that you need to be aware of for PowerPivot:
1. Creating Secure Store Service Application ID
2. Linking that ID in the PowerPivot Excel workbook while publishing it (I may write another article on this very soon).
3. Central Administration > PowerPivot > Configure Service Application Settings
4. Central Administration > Monitoring >  Timer Jobs > Review Job Definitions > PowerPivot Data Refresh Timer Job


References:




Hope this information was useful. I look forward to hearing about your experiences with the setup.

Wednesday, October 13, 2010

Troubleshoot on Setup of PowerPivot for SharePoint

I know it can be really frustrating when you spend long hours just trying to set something up on your developer box, especially when you have followed the instructions exactly (or maybe not) as given in the MSDN article.

The following MSDN article guides you through setting up PowerPivot for SharePoint.

I will try to highlight some of the gotchas that you need to watch out for, while you perform the setup.

The following was the configuration on my dev box:
  • Domain Controller with Active Directory 
  • Windows Server 2008 R2
  • SharePoint 2010
  • SQL Server 2008 R2 Dev edition (With SSIS, SSAS & SSRS)
  • Office 2010
  • .Net 3.5 SP1 or above
  • PowerPivot Add-in for Excel 2010  (http://www.powerpivot.com/download.aspx)

While following the setup article, you may run into a weird error like:

Failed to load receiver assembly "Microsoft.AnalysisServices.SharePoint.Integration, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" for feature "PowerPivot" (ID: f8c51e81-0b46-4535-a3d5-244f63e1cab9).: System.IO.FileNotFoundException: Could not load file or assembly 'Microsoft.AnalysisServices.SharePoint.Integration, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91' or one of its dependencies. The system cannot find the file specified.
File name: 'Microsoft.AnalysisServices.SharePoint.Integration, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91'


If you see the above error, then you will be really interested in the following important steps.
  1. In the Installation Type step of the wizard, select “New installation or add shared features”.  DO NOT select the “Add features to an existing instance…” option.
  2. In the Setup Role step of the wizard, select the “SQL Server PowerPivot for SharePoint” radio button. DO NOT select the “SQL Server Feature Installation” option. Then, in the Instance Configuration step, keep the instance ID as POWERPIVOT, and keep the default MSSQLSERVER instance of SQL Server Analysis Services separate from this instance required for PowerPivot.
  3. In the Service configuration step, provide a domain account that has rights on both the SQL instances. I know it is not a good practice, but what the hell, this is supposed to be a developer box. I used the domain admin :)
  4. In the Ready to Install step, you will have to browse to the configuration file path shown at the bottom of the dialog. Modify the FARMADMINPORT value to point to the existing SharePoint Central Administration port.
  5. Before you move to the next step in the wizard,
    • Ensure that you have Microsoft.AnalysisServices.SharePoint.Integration.dll in C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\SQLServer2008R2\x64. This DLL can be copied from the GAC, where it will be present from one of your previous failed attempts.
    • Then remove all other instances of Microsoft.AnalysisServices.SharePoint.Integration.dll, especially from the GAC. Find any remaining copies (e.g. in the GAC temp folders) and delete them as well.
    • Note: If you are not able to delete some instance because a process is holding it, identify the process and terminate/stop it from Task Manager or the Windows Services console. Usually it will be w3wp.exe, OWSTimer.exe, or the Web Analytics services. You can use Process Explorer to identify the holding process: http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx
  6. Wish yourself good luck before you press that Install button. :)

Have patience in the last step; it could take a while to finish. I had to start the SharePoint Timer Service, thinking that maybe the farm deployment was not completing because the timer service had been terminated in step 5 (I am not sure about this; I mention it just in case you face problems).
A comprehensive setup guide is available here:




After the installation, you can follow the MSDN article for the configuration.

Just before you leave, one tip: once you install the PowerPivot add-in for Excel, you may see that Excel becomes slow to load. If you wish to enable/disable this add-in, go to Excel -> File -> Options -> Add-ins -> Manage (COM Add-ins/Disabled Items).



Hopefully, this information has saved you that precious time.

Now that the setup is done, I am waiting to explore PowerPivot for SharePoint in detail. If I come across anything interesting, I will share my views in the next article. Till then, stay tuned :)

Friday, October 8, 2010

Excel Services (SharePoint 2010)

Excel 2010 is an authoring tool, while Excel Services is more of a reporting tool.
The following are the main uses:
  • Sharing & collaborating on spreadsheets through the browser.
  • Report building and publishing using dashboards.
  • Providing a web-service interface that allows applications to interact with the spreadsheets stored on the server.
What's new for Excel Services (SharePoint Server 2010)?

[Image: Excel Services top 10 features]
  • Multi-user collaboration : allows a workbook to be authored simultaneously by more than one person. This feature is also available in SkyDrive; you may want to try it out.
  • Slicer : the Slicer is a business intelligence data filter in Microsoft Excel 2010. It helps in creating an interactive and flexible layout for analyzing business data while using PivotTables and OLAP functions.
  • Ajax enabled : the service is Ajax enabled, so changes to a range of cells only require refreshing those cells instead of the entire page.
  • Improved user defined functions : these help in creating complex calculations over cells of the workbook.
  • Manage Service Applications : the service can be enabled in the Central Administration console.
  • Windows PowerShell : Excel Services PowerShell cmdlets aid in installing/configuring from the PowerShell command prompt without going to the Central Administration console. This also helps in automating the build and setup of the application.
  • Trusted locations : ensure that all the workbooks that get loaded are trusted. These locations are created by default.
  • Unattended service account : a special low-privilege account that usually has only read access. It is used when authentication is set to None.
  • Client applications : apart from the regular Excel Services APIs, the service can be exposed through a REST (Representational State Transfer) API. You can embed Excel content in other Office apps or on the web. Users can also use JSOM or ECMAScript against Excel content to enable interactive behavior in web browsers.
    Architecture:
    [Image: Excel Services architecture]
    • Excel Calculation Services : the engine that loads the workbook, calculates the spreadsheets, updates external data, and maintains session state for interactivity.
    • Excel Web Access : a web part that delivers Excel workbooks in the browser.
    • Excel Web Services : a web service hosted in SharePoint that provides methods for developers to create custom applications built on an Excel workbook (see the sketch after this list).
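
    As a minimal sketch of that last item, the following C# console snippet reads a cell through Excel Web Services (assumptions: a web reference named ExcelWs generated from http://<server>/_vti_bin/ExcelService.asmx, and placeholder server, workbook, and sheet names):

    using System;
    using System.Net;
    using ExcelWs; // generated web-service proxy

    class ExcelServicesDemo
    {
        static void Main()
        {
            ExcelService service = new ExcelService();
            service.Credentials = CredentialCache.DefaultCredentials;

            Status[] status;
            // Open a workbook from a trusted file location and get a session id.
            string sessionId = service.OpenWorkbook(
                "http://server/Shared Documents/Sales.xlsx", "en-US", "en-US", out status);

            // Read a single calculated cell (sheet "Summary", cell B2).
            object value = service.GetCellA1(sessionId, "Summary", "B2", true, out status);
            Console.WriteLine(value);

            service.CloseWorkbook(sessionId);
        }
    }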

    To set up and start using Excel Services, you need to understand the following concepts:

    Connections and Excel workbooks:
    Excel workbooks can contain two types of connections.
    • Embedded connections are stored internally as part of the workbook.
    • Linked connections are stored externally as separate files that can be referenced by a workbook. Linked connection files can be centrally stored, secured, managed, and reused.
    Office data connection files (.odc) are the connection files used in Excel Services Application.    

    Trusted file locations : Excel Services only loads workbooks from trusted file locations. An administrator has to explicitly mark a directory (SharePoint document libraries, UNC paths, or HTTP web sites) as a trusted file location. These directories are added to a list that is internal to Excel Services.

    Trusted data providers : external databases that Excel Calculation Services is explicitly configured to trust when it is processing data connections in workbooks.

    Trusted data connection libraries : SharePoint document libraries that contain Office data connection (.odc) files.

    Authentication: Excel Services Application accesses external data sources by using a delegated Windows identity.
    There are three modes of Authentication to the data source
    • Windows Integrated (Requires servers to be on same domain. Kerberos setup required if there is a double hop)
    • Secure Store Service
    • None

    Cache:
    Excel Services maintains a cache to avoid loading/refreshing data from the data source on every request. At times, however, the cache may need to be reloaded per request; it can be configured to reload on every open, which helps when different data must be shown to different sets of users. A cache expiration time can also be set to invalidate the cache and load fresh data.



    Tuesday, September 28, 2010

    Dynamic Language Runtime

    The dynamic language runtime (DLR) is a runtime environment that sits on top of the common language runtime (CLR) and provides a set of services for dynamic languages to run on the .NET Framework. It includes a set of libraries and constructs that allow object types to be identified at run time, in contrast to statically typed languages like C#, where the object types have to be defined at compile time.
    Scripting/interpreted languages like JavaScript and PHP are good examples of dynamic languages.

    Other popular examples are Lisp, Smalltalk, Ruby, and ColdFusion.

    Primary Advantages of DLR
    • Simplifies Porting Dynamic Languages to the .NET Framework
    • Enables Dynamic Features in Statically Typed Languages (see the sketch after this list)
    • Enables Sharing of Libraries and Objects
    • Provides Fast Dynamic Dispatch and Invocation
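
    As a small illustration of the second point, here is a minimal C# 4.0 sketch (the names are illustrative) using the dynamic keyword and ExpandoObject, both of which are bound through the DLR at run time:

    using System;
    using System.Dynamic;

    class DlrDemo
    {
        static void Main()
        {
            dynamic person = new ExpandoObject();
            person.Name = "World";                 // member added at run time
            person.Greet = (Func<string>)(() => "Hello, " + person.Name);

            // Member lookup and invocation are resolved by the DLR at the call site.
            Console.WriteLine(person.Greet());
        }
    }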

    Architecture
    The main services the DLR offers on top of the CLR include the following:
    • Expression trees : used to represent language semantics
    • Call site caching : caches information about operations and types already executed, so that subsequent dynamic operations are dispatched faster
    • Dynamic object interoperability : provides a set of classes that language implementers can use and extend for interoperability with .NET

    Thursday, September 23, 2010

    Overview of Entity Framework 4.0


     "The Entity Framework bridges the gap between how developers commonly manipulate conceptual objects (customers, orders, products; posts, tags, members; wall posts, private messages, friend connections) and the way data is actually stored (records in database tables).  The technical term for a tool that provides this abstraction is object relational mapper (ORM)."

    This blog gives a gist of what EF (Entity Framework 4) has to offer and how to program against it.

    The best place to start: http://msdn.microsoft.com/en-us/data/aa937723.aspx

    For beginners, I would recommend reading http://blogs.msdn.com/b/adonet/archive/2010/07/19/absolue-beginners-guide-to-entity-framework.aspx

    I have been going through the Entity Framework lately, to use it in our next project. To my surprise, programming against data (the DB) becomes very simple using this framework. It helps the developer focus more on understanding the business domain and modelling the business entities than on worrying about how the data is stored and accessed.

    The framework provides ways to generate the database directly from the modelled entities. This is called the Model-First approach and is usually recommended. The reverse, generating entities from an existing database, is also possible; this helps if you already have the database ready and still want to leverage the framework.

    LINQ queries or lambda expressions are mostly used to perform CRUD operations against the business entities instead of directly against the database.

    If you open the .edmx file (the entity model file) in an XML editor, you will basically see the following sections:
        * Storage Model - defines the entities, entity sets, and associations; all information required to create the database is picked from here.
        * Conceptual Model - defines the entities, entity sets, and associations that are consumed from the business layer; information for modelling (the diagram) is picked from here.
        * Mappings/Associations - the mappings between the Storage and the Conceptual Model are defined here.

    An EntitySet is a pluralized version of the entity. A few base classes that you need to be aware of are:
        * ObjectContext is the base class for the entity container; it is used as a container for entities.
        * EntityObject is the base class for each entity class.

    By default, EF uses lazy (deferred) execution: a query is sent to the database only when its results are actually enumerated, and related entities are loaded only when first accessed. There are also ways to force immediate execution of a query (for example, ToList() below).


    A few commands to query and update look as follows:
    // Executes immediately because of ToList().
    ctx.Contacts.Where(c => c.SalesOrderHeaders.Any()).ToList();
    // Queue up an insert and a delete...
    ctx.Customers.AddObject(customer);
    ctx.SalesOrderDetails.DeleteObject(order);
    // ...and push all pending changes to the database.
    ctx.SaveChanges();

    There are three ways of programming against the Entity Framework:
    • LINQ to Entities - write LINQ queries to perform operations on entities.
    • Entity SQL - use SQL-like strings as commands; note that you are writing commands against the entities and not the DB.
    • Query Builder - use the methods provided by the Entity Framework instead of LINQ.

    There are times when you would like to perform a complex set of operations on a varied set of tables while interacting with entities. This is when you could map entities to stored procedures. However, there are some limitations on using stored procedures with entities (for example, if one of the operations, say Insert, is mapped to a stored procedure, then the other operations also have to be mapped through stored procedures).

    Tracing the SQL commands:
    During debugging, you may want to see the DB query that your LINQ operation is converted into. The usual approach is to use the SQL Profiler.

    This can be time-consuming, as you keep switching between VS 2010 and SQL Server. Instead, you can leverage programming-model tracing using the System.Data.Objects.ObjectQuery.ToTraceString and System.Data.EntityClient.EntityCommand.ToTraceString methods, which enable you to view the store commands at runtime without having to run a trace against the data source.

    LINQ TO ENTITIES
    // Define an ObjectSet to use with the LINQ query.
    ObjectSet<Product> products = context.Products;
    // Define a LINQ query that returns a selected product.
    var result = from product in products
                 where product.ProductID == productID
                 select product;
    // Cast the inferred type var to an ObjectQuery
    // and then write the store commands for the query.
    Console.WriteLine(((ObjectQuery<Product>)result).ToTraceString());


    ENTITY SQL
    // Define the Entity SQL query string.
    string queryString =
        @"SELECT VALUE product FROM AdventureWorksEntities.Products AS product
          WHERE product.ProductID = @productID";
    // Define the object query with the query string.
    ObjectQuery<Product> productQuery =
        new ObjectQuery<Product>(queryString, context, MergeOption.AppendOnly);
    productQuery.Parameters.Add(new ObjectParameter("productID", productID));
    // Write the store commands for the query.
    Console.WriteLine(productQuery.ToTraceString());


    QUERY BUILDER
    int productID = 900;
    // Define the object query for the specific product.
    ObjectQuery<Product> productQuery =
        context.Products.Where("it.ProductID = @productID");
    productQuery.Parameters.Add(new ObjectParameter("productID", productID));
    // Write the store commands for the query.
    Console.WriteLine(productQuery.ToTraceString());


    You can retrieve objects by key using the GetObjectByKey and TryGetObjectByKey methods on ObjectContext. These load the object with the specified EntityKey into the object context. When you use GetObjectByKey, you must handle an ObjectNotFoundException (a sketch follows).
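
    A minimal sketch, reusing the context and the AdventureWorks names from the examples above (the qualified entity set name is an assumption based on that model):

    // Build the key for the Product with ProductID = 900 and fetch it by key.
    EntityKey key = new EntityKey("AdventureWorksEntities.Products", "ProductID", 900);
    try
    {
        Product product = (Product)context.GetObjectByKey(key);
        Console.WriteLine(product.Name);
    }
    catch (ObjectNotFoundException)
    {
        Console.WriteLine("No product with that key.");
    }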

    On the abstraction that EF introduces, and its performance:

    http://msdn.microsoft.com/en-us/library/cc853327.aspx
    http://blogs.msdn.com/b/adonet/archive/2008/02/11/exploring-the-performance-of-the-ado-net-entity-framework-part-2.aspx

    To improve performance, use compiled LINQ queries: precompiling a LINQ query avoids repeating the query-translation work on every execution, ensuring faster operations.
    http://msdn.microsoft.com/en-us/library/bb896297.aspx provides more information; a sketch follows.
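
    A minimal sketch, assuming the AdventureWorksEntities context used in the earlier examples:

    using System;
    using System.Data.Objects;
    using System.Linq;

    static class CompiledQueries
    {
        // Translated to a store command once; later invocations reuse the compiled query.
        public static readonly Func<AdventureWorksEntities, int, IQueryable<Product>>
            ProductById = CompiledQuery.Compile(
                (AdventureWorksEntities ctx, int id) =>
                    ctx.Products.Where(p => p.ProductID == id));
    }

    // Usage:
    // var product = CompiledQueries.ProductById(context, 900).FirstOrDefault();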


    By default, EF gets only an entity's own data, without its related (associated) entities' data. For example, there may be a case where you want to retrieve SalesPersons along with their SalesOrder details.

    There are ways to tell EF to retrieve all the related entities' data up front, so that you do not end up using a foreach loop to fill in the related data, which would result in far too many DB calls.


    Using query paths, you can preload the related entities:

    // When n levels of details have to be retrieved.
    var contacts = (from contact in
                        context.Contacts.Include("SalesOrderHeaders.SalesOrderDetails")
                    select contact).FirstOrDefault();

    // When other related tables have to be included.
    ObjectQuery<SalesOrderHeader> query =
        context.SalesOrderHeaders.Include("SalesOrderDetails").Include("Address");


    Security

    Can't end the article without mentioning security. I just evaluated the classic problem of SQL injection: the old technique of using parameterized queries is still valid in the Entity Framework.

    // SQL injection possible: untrusted input concatenated into the command text
    // (userInput is an illustrative variable).
    context.ExecuteStoreQuery<Product>("select * from Products where pid = " + userInput);
    // Guarded against SQL injection: the value travels as a parameter.
    context.ExecuteStoreQuery<Product>("select * from Products where pid = @p0",
        new SqlParameter { ParameterName = "p0", Value = 1 });


    Overall, I am sure this is going to reduce the developer's work; however, it demands a little more discipline to ensure that developers do not plainly treat the entities as just a set of tables.

    That's all I have been able to read and evaluate about the Entity Framework so far. Hopefully I will follow up with an advanced version of this article, touching on transactions and concurrency, with more detail on the coding and more snippets. Till then, happy blogging!! :)

    Don't forget to start reading at http://msdn.microsoft.com/en-us/data/aa937723.aspx

    Saturday, September 11, 2010

    Windows Server AppFabric Caching Framework


    Windows Server AppFabric Caching, also known by its codename Velocity, is a framework that provides a unified cache. Application caching is nothing new and has been around for many years. It definitely saves those milliseconds, or even more, depending on the number of concurrent users fetching the data.

    Earlier, the cache used to be part of the web servers; later, it moved into the application servers. However, the caching capability then depended on the capacity of the application server. With more and more concurrent users, much bigger caching servers became necessary. To prevent these servers from becoming bottlenecks, a caching framework was needed that could provide all the capabilities of load-balanced web servers.




    The following is some of the functionality that was expected from, and delivered by, the Velocity framework:
    • Load balancing of the cache servers
    • Provision to scale dynamically without having to stop the applications
    • High availability in case some cache servers go down
    • Maintaining consistency of the data copies stored across cache servers
    • A mechanism to invalidate the cache when the underlying data store changes

    To install & configure the framework, follow this link: http://msdn.microsoft.com/en-us/library/ff383731.aspx

    There are two places to store the cluster configuration:
    • Network shared folder - usually for smaller applications (1-5 servers)
    • SQL Server - for larger enterprises (more than 5 servers)
    Configuration can be done either programmatically or by using Windows PowerShell; the AppFabric/Velocity cmdlets are installed as part of the framework.

    The following are some of the terms used in the Velocity framework:
    • Cache Server
    • Cache Host - Windows Service
    • Cache Client - Web Application Accessing the Cache 
    • Local Cache - Data is stored in memory of the Web Application
    • Server Cache - Data is serialized and saved in servers other than Web Application Server
    • Lead Host - responsible for storing cache data and for coordinating with the other hosts to manage the integrity of the cache cluster.
    • Cache Cluster - A set of Cache hosts working together to provide a unified Cache view

    There are two ways of partitioning a cache. You can configure data to go into one of these partitions to manage the cache effectively for performance, and to isolate the invalidating effect from other cached data (see the sketch after this list):
    1. Named Cache
    2. Named Region
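
    Here is a minimal C# sketch of the cache client API (assumptions: the Microsoft.ApplicationServer.Caching client assemblies are referenced, client configuration points at the cluster, and the cache name, region name, and keys below are placeholders):

    using System;
    using Microsoft.ApplicationServer.Caching;

    class VelocityDemo
    {
        static void Main()
        {
            // Reads the cache-host settings from the application configuration file.
            DataCacheFactory factory = new DataCacheFactory();

            // A named cache is a configured partition of the cluster.
            DataCache cache = factory.GetCache("default");

            // Plain put/get, with an explicit timeout-based expiration.
            cache.Put("greeting", "Hello from the cache cluster", TimeSpan.FromMinutes(10));
            Console.WriteLine((string)cache.Get("greeting"));

            // A named region keeps related items together on a single cache host.
            cache.CreateRegion("catalog");
            cache.Put("product-1", "Widget", "catalog");
            Console.WriteLine((string)cache.Get("product-1", "catalog"));
        }
    }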


    Memory management
    There are two ways of invalidating the cache:
    • Timeout
    • Notification (default polling interval of 300 seconds) - the client checks for change notifications programmatically.

    Periodically, invalidated cache entries get cleaned up for effective memory management. However, the framework may also choose to evict cache data when memory is scarce. Your application may hit exceptions if it is programmed to always expect the data to be in the cache; hence it is a must to write your application to handle such cases, e.g. by treating a cache miss as a signal to reload from the data store.

    The eviction strategy used by the Velocity framework is LRU (least recently used), i.e. the least recently accessed data gets evicted first.

    High availability is achieved by keeping copies of the cached data. The number of copies to maintain can be configured.

      
    Security
    Security is handled at the transport level. The Velocity framework can be configured to allow only certain user contexts to access the cache servers. You should grant the user context of the web application's application pool access to the cache servers.

    Lastly, ASP.NET 4.0 applications can leverage the Velocity framework for storing session state.
    Hopefully, I have touched on some concepts of the new caching framework. Can't wait to implement this on my next project.
    References:
    http://msdn.microsoft.com/en-us/library/ee790954.aspx