Error indexing Blog content after upgrade to 4.1 SP1
We recently upgraded a launched 4.0 site to the 4.1 SP1 build (4.1.1395.0) because we were unable to index static HTML content (we would simply receive 'object not set to an instance of an object' within a JavaScript dialog). This upgrade fixed the indexing of static HTML content, but now we are unable to index Blog content: during the re-indexing process we receive the same 'object not set to an instance of an object' error. It is incredibly frustrating to have one item fixed and another break without reason.
Some more information on how our Sitefinity 4.x instance is set up: There are four Blogs, two of which are populated by importing an external feed and two for which the site's content authors create posts. Since launch the authors have generated over 500 posts (at a rate of 2-3 a day for each of the blogs). I can narrow the issue down to one or more posts created within one of the blogs after February 1st, 2011: after deleting all posts after 2011-02-01 for that particular blog, re-indexing succeeds. However, this does not help, as Sitefinity does not report which of the items it is trying to index is in error. Furthermore, nothing is written to the log files located at ~\App_Data\Sitefinity\Logs\.
Attached are two screen captures: the first being the configuration of the search index, and the second the error dialog shown when the re-indexing operation fails.
Are there any options for running diagnostics on the search indexing to narrow down where this error is occurring? Is there anything I can do to resolve this?
Hi Anthony,
I'm sorry to hear about the troubles you are experiencing after the upgrade. This is not a common scenario after upgrading to our 4.1 SP1 version. Could you please use a tool such as Firebug and check whether it provides more information on the error? Also, you mentioned you were able to narrow your search down to a certain selection of blog posts. Could you tell me whether any particular change was introduced in these posts, or did you discover this randomly?
Kind regards,
Boyan Barnev
the Telerik team
Hi Boyan,
Sorry for my late response, but I was reallocated to other projects and have only now been able to revisit this issue.
As asked, I've used both Firebug and Fiddler to track the request/response data sent during the re-index request. However, this doesn't appear to provide any additional information on the source of the error: all that's returned is a single key/value pair in the form of "Detail"="Object reference not set to an instance of an object". I'm including this below for reference:
Request
PUT http://localhost:62748/Sitefinity/Services/Publishing/PublishingService.svc/reindex/821a334f-d9a7-4e8a-84e1-667a602afacc/ HTTP/1.1
Host: localhost:62748
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110420 Firefox/3.6.17
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
X-Requested-With: XMLHttpRequest
Content-Type: application/json; charset=UTF-8
Referer: http://localhost:62748/Sitefinity/Administration/Search
Content-Length: 2
Response
HTTP/1.1 500 Internal Server Error
Server: ASP.NET Development Server/10.0.0.0
Date: Mon, 13 Jun 2011 14:59:22 GMT
X-AspNet-Version: 4.0.30319
Content-Length: 66
Cache-Control: private
Content-Type: application/json
Connection: Close
{"Detail":"Object reference not set to an instance of an object."}
With respect to narrowing down the results, this was something I found randomly. When the search indexing began throwing this error I ran SQL Profiler in an attempt to find whether a particular blog post or posts were causing the error, but due to the volume of queries performed by the Sitefinity indexer I was unable to determine any single cause. I then fell back on a more traditional approach: I restored a copy of the database, ran a SQL query to delete all blog posts from a particular blog, and then ran the re-index command from the Sitefinity administrative UI. I did this for each of the four blogs created for the site and found that deleting the content of one of the blogs made everything work. Unfortunately this is not a solution I can go forward with. How can I possibly tell a client that I need to delete all their content in order to re-index? And what would be the point of re-indexing if I had just deleted all their content?
Are there any additional debugging options available for finding the source of the error? Are any future versions of Sitefinity 4.x going to implement additional diagnostics to assist developers or administrators in finding the source of these errors? Even having the indexer return the unique identifier of the piece of content on which the error occurred would help. As it is, reporting a standard NullReferenceException is most unhelpful.
Hi Anthony,
I'm really sorry to hear about this; I was hoping we could get some more information from the Firebug log. We are actually working on extending the logging functionality that Sitefinity offers, and we started that with version 4.1, where it is already extended. Maybe you could try running a trace log of the web services exceptions that occur while this error reproduces. Please follow the instructions provided in this KB article.
Kind regards,
Boyan Barnev
the Telerik team
Hi Boyan, thanks for your assistance. I enabled the tracing and was able to get additional details concerning the error. The following is the stack-trace that produces the NullReferenceException when indexing:
System.NullReferenceException: Object reference not set to an instance of an object.
   at Telerik.Sitefinity.Modules.GenericContent.DynamicLinksParser.GetItem[T](Guid id, ContentLifecycleStatus status, IQueryable`1 query)
   at Telerik.Sitefinity.Modules.GenericContent.DynamicLinksParser.GetContentUrl(String key, Guid id, Boolean resolveAsAbsoluteUrl, ContentLifecycleStatus status)
   at Telerik.Sitefinity.Web.Utilities.LinkParser.Resolve(HtmlChunk chunk, Int32 valueIndex, GetItemUrl itemUrl, ResolveUrl resolveUrl, Boolean preserveOriginalValue, Boolean resolveAbsolute)
   at Telerik.Sitefinity.Web.Utilities.LinkParser.ParseHtml(String html, GetItemUrl itemUrl, ResolveUrl resolveUrl, Boolean resolve, Boolean preserveOriginalValue, Boolean resolveAbsolute, ProcessChunk processChunk)
   at Telerik.Sitefinity.Web.Utilities.LinkParser.ResolveLinks(String html, GetItemUrl itemUrl, ResolveUrl resolveUrl, Boolean preserveOriginalValue, Boolean resolveAsAbsoluteUrl)
   at Telerik.Sitefinity.Publishing.Pipes.PublishingPipeBase.ConvertImportedItemForMapping(Object item)
   at Telerik.Sitefinity.Publishing.Pipes.PublishingPipeBase.ImportItem(Object item)
   at Telerik.Sitefinity.Publishing.Pipes.Content.SitefinityContentPipe.HandleItemAdded(IDataItem item)
   at Telerik.Sitefinity.Publishing.Pipes.Content.SitefinityContentPipe.HandleItemModified(IDataItem item)
   at Telerik.Sitefinity.Publishing.Pipes.Content.SitefinityContentPipe.HandleToPublishingPoint(HandleActionArgs args)
   at Telerik.Sitefinity.Publishing.Pipes.Content.SitefinityContentPipe.HandleItemAction(IEnumerable`1 args)
   at Telerik.Sitefinity.Publishing.Pipes.PublishingPipeBase.ToPublishingPoint()
   at Telerik.Sitefinity.Publishing.PublishingManager.InvokeInboundPushPipes(Guid publishingPointId, String providerName)
   at Telerik.Sitefinity.Publishing.Web.Services.PublishingAdminService.ReindexSearchContent(String providerName, String pointId)
   at SyncInvokeReindexSearchContent(Object , Object[] , Object[] )
   at System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke(Object instance, Object[] inputs, Object[]& outputs)
   at System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage41(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage4(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage31(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage3(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage2(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage11(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage1(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.MessageRpc.Process(Boolean isOperationContextSet)
private static T GetItem<T>(Guid id, ContentLifecycleStatus status, IQueryable<T> query) where T: Content
{
    ParameterExpression expression4;
    if (status != ContentLifecycleStatus.Live)
    {
        ParameterExpression expression3;
        if (status == ContentLifecycleStatus.Master)
        {
            ParameterExpression expression;
            ParameterExpression expression2;
            T local2 = Queryable.Where<T>(query, Expression.Lambda<Func<T, bool>>(Expression.Equal(Expression.Property(expression = Expression.Parameter(typeof(T), "i"), (MethodInfo) methodof(Content.get_Id)), Expression.Constant(id), false, (MethodInfo) methodof(Guid.op_Equality)), new ParameterExpression[] { expression })).FirstOrDefault<T>();
            Guid originalContentId = local2.OriginalContentId;
            return Queryable.Where<T>(query, Expression.Lambda<Func<T, bool>>(Expression.Equal(Expression.Property(expression2 = Expression.Parameter(typeof(T), "i"), (MethodInfo) methodof(Content.get_Id)), Expression.Constant(originalContentId), false, (MethodInfo) methodof(Guid.op_Equality)), new ParameterExpression[] { expression2 })).FirstOrDefault<T>();
        }
        return Queryable.Where<T>(query, Expression.Lambda<Func<T, bool>>(Expression.AndAlso(Expression.Equal(Expression.Property(expression3 = Expression.Parameter(typeof(T), "i"), (MethodInfo) methodof(Content.get_Id)), Expression.Constant(id), false, (MethodInfo) methodof(Guid.op_Equality)), Expression.Equal(Expression.Convert(Expression.Property(expression3, (MethodInfo) methodof(Content.get_Status)), typeof(int)), Expression.Convert(Expression.Constant(status), typeof(int)))), new ParameterExpression[] { expression3 })).FirstOrDefault<T>();
    }
    return Queryable.Where<T>(Queryable.Where<T>(query, PredefinedFilters.PublishedItemsFilter<T>()), Expression.Lambda<Func<T, bool>>(Expression.Equal(Expression.Property(expression4 = Expression.Parameter(typeof(T), "i"), (MethodInfo) methodof(Content.get_Id)), Expression.Constant(id), false, (MethodInfo) methodof(Guid.op_Equality)), new ParameterExpression[] { expression4 })).FirstOrDefault<T>();
}
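Reading the decompiled method above, the only place where GetItem itself dereferences a possibly-null value appears to be local2.OriginalContentId in the Master branch: FirstOrDefault returns null when no item has the requested id, which would be consistent with GetItem being the top frame of the trace. The following is a rough Python analogue of that pattern, not Sitefinity code; the Item class and all function names are hypothetical and exist only for illustration.

```python
from uuid import uuid4


class Item:
    """Hypothetical stand-in for a content item (not a Sitefinity type)."""

    def __init__(self, id, original_content_id=None):
        self.id = id
        self.original_content_id = original_content_id


def first_or_none(items, item_id):
    """Analogue of Where(i => i.Id == id).FirstOrDefault()."""
    return next((item for item in items if item.id == item_id), None)


def get_master_unguarded(items, item_id):
    """Mirrors the Master branch as decompiled: no check on the first lookup."""
    found = first_or_none(items, item_id)
    # Raises AttributeError when found is None, much as the C# line
    # local2.OriginalContentId raises NullReferenceException.
    return first_or_none(items, found.original_content_id)


def get_master_guarded(items, item_id):
    """Same lookup, but returns None when the referenced item is missing."""
    found = first_or_none(items, item_id)
    if found is None:
        return None
    return first_or_none(items, found.original_content_id)
```

With a guard like the second function, a link to a deleted item would resolve to nothing instead of aborting the whole re-index. Whether that is the right behaviour for the indexer is a design question this sketch does not answer.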
Hi Anthony,
Thank you for the prompt reply. Would it be possible to send over a copy of your project files and a backup of the DB, so I can debug locally and inspect this issue further? It would also be useful if you could attach the *.e2e file that was generated when you ran the trace log, as it is easier to inspect with the Service Trace Viewer, which makes the information more readable. Thanks in advance for your cooperation.
Best wishes,
Boyan Barnev
the Telerik team
I certainly can. When zipped the database and web application source are ~107MB in size. What would be your preferred method of receiving this file?
Hello Anthony,
Would it be possible to provide me with an ftp or download link where I can download the documents from? You can drop me a line at boyan.barnev [at] telerik.com with the download link and any additional information related to the download. Thanks in advance.
Best wishes,
Boyan Barnev
the Telerik team
Hello Anthony,
Just a quick follow-up: I have just replied to the email you sent me containing the project files. Thank you very much for providing them, as this really helped in finding the reason behind this behavior. You can find a detailed explanation in my e-mail reply, and for your convenience I'm also pasting it here:
"There was an issue with all Blog posts created by one of your users: the problematic posts had their properties (e.g. Title, Content) missing, yet they still referred to content items (images in most cases) that were no longer present in the project/database. This was the reason search indexing was not working: the URLs of those images could not be resolved because the images were physically not present. Deleting the problematic posts has fixed the problem, and your site now indexes blog content properly. The modifications made to your project in the process of fixing the indexing functionality were the following:
1. Upgrade to the latest officially supported version (build 4.1.1501)
2. Execute the following code:
var myUser = UserManager.GetManager().GetUser("Emina");
var posts = App.WorkWith().BlogPosts().Where(bP => bP.Owner == myUser.Id).Delete().SaveChanges();
3. Build the project again
4. Delete all existing search indices related to blogs
5. Create a sample page with a Blogs widget
6. Create a new search index for Blogs
7. Create a sample page containing a search widget and a search results widget
8. Reindex
9. Test search functionality: everything should work properly"
Please do not hesitate to get back to us if any of the problems persist, or you have some additional questions. Thanks in advance.
Regards,
Boyan Barnev
the Telerik team
Thanks Boyan. While I wasn't able to use your solution directly (I can't in good conscience delete 180 pieces of content the client has written over the past four months), it did give me a good starting point for resolving the issue. By querying the database directly I was able to reconcile the posts that referenced images/media that no longer existed within the database.
The following script parses out the GUID of the first image found (if any) in each of the Blog Posts by that specific user. Using the list of GUIDs, I went through all existing posts and edited their content to remove the references to the invalid images. Finally (and I'm not sure why this had to be done), I ran an additional SQL statement to remove the query-string parameter '?Status=Master' that was sometimes appended to the end of image references. The end result: indexing blog post content works!
declare @OwnerID uniqueidentifier;
set @OwnerID = 'C4C9223B-EB11-401D-A5BB-4010CB55BC0B';

declare @Pattern nvarchar(100);
set @Pattern = '%src="_images%';

declare @Media table (id uniqueidentifier);

insert into @Media
select cast(upper(substring(content_, patindex(@Pattern, content_) + len(@Pattern) - 1, 36)) as uniqueidentifier)
from sf_blog_posts
where ownr = @OwnerID and patindex(@Pattern, content_) > 0
order by date_created desc;

select count(distinct(id)) from @Media;

select distinct(id)
from @Media
where id not in (select content_id from dbo.sf_media_content);
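Since the script above only picks up the first image in each post, a post with several broken references needs more than one pass. As a hedged illustration, the Python sketch below collects every image GUID in a post body, reports those missing from a known set of media ids, and strips the '?Status=Master' suffix from image references. The regular expression mirrors the T-SQL pattern (one arbitrary character on each side of "images"). The function names, the sample data shapes and the idea of loading post bodies into a dictionary are assumptions made for this example, not part of Sitefinity.

```python
import re
import uuid

# Mirrors the T-SQL pattern '%src="_images%': 'src="', any one character,
# 'images', one more character, then a 36-character GUID.
IMAGE_REF = re.compile(
    r'src=".images.([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})'
)


def referenced_image_ids(html):
    """Return the set of image GUIDs referenced anywhere in one post body."""
    return {uuid.UUID(match) for match in IMAGE_REF.findall(html)}


def missing_image_ids(posts, known_media_ids):
    """Map post id -> referenced image GUIDs absent from known_media_ids.

    posts is assumed to be a dict of post id -> HTML content, and
    known_media_ids a set of uuid.UUID values (for example, the content_id
    column of the media table loaded by some other means).
    """
    missing = {}
    for post_id, html in posts.items():
        orphans = referenced_image_ids(html) - known_media_ids
        if orphans:
            missing[post_id] = orphans
    return missing


def strip_status_master(html):
    """Remove a trailing '?Status=Master' from src attribute values."""
    return re.sub(r'(src="[^"]*?)\?Status=Master(?=")', r"\1", html)
```

Reporting orphans per post, instead of as one flat list of GUIDs, makes it easier to see which posts need editing.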
Hello Anthony,
I am really glad to hear that the issue you were facing is now resolved. I totally agree that the solution you implemented does a much better job of preserving user content, and I want to express my gratitude for sharing it with us and the community. It was a pleasure to help you; please do not hesitate to write back if any additional questions come up.
Regards,
Boyan Barnev
the Telerik team