Prevent the search index from indexing certain files
How do you go about it when you have intranet documents and don't want them to be indexed?
I have created a document library and named it intranet. Then I changed the permissions so that only certain roles can see/access the files. I also set some permissions on the page where the documents will be shown.
Now the documents get indexed and an extract is shown in the search results.
Of course they cannot be accessed because of the permission settings. However, part of each file is still exposed in the search results.
So the question is: how do you make the search index permission-aware?
Markus
PS: Is anyone else having the problem that responses in the forum do not show up?
Anyone?
Hello,
At the moment the only way to achieve this would be to implement a custom pipe to ignore documents either by permission or by library. At the moment we do not have a sample for this but we are working on it and it will be made available via a KB or a Blog post shortly after the release. A better interface and settings for this scenario are planned for implementation and should be available in a future release.
I apologize for the inconvenience.
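For readers following along, the idea of ignoring documents by permission at indexing time can be sketched roughly as below. This is not Sitefinity code and not the sample promised above; it is a minimal, language-neutral illustration in Python, and every name in it (`Document`, `view_roles`, `PUBLIC_ROLE`, `filter_for_index`) is hypothetical. It assumes each document carries the set of roles allowed to view it.

```python
from dataclasses import dataclass, field

# Hypothetical name for the role that represents anonymous visitors.
PUBLIC_ROLE = "Everyone"


@dataclass
class Document:
    title: str
    library: str
    text: str
    # Roles allowed to view the document (assumed to be known at indexing time).
    view_roles: set = field(default_factory=set)


def is_publicly_viewable(doc):
    """True only if anonymous visitors are allowed to view the document."""
    return PUBLIC_ROLE in doc.view_roles


def filter_for_index(docs):
    """Drop restricted documents before they are handed to the indexer,
    so no extract of their text can ever appear in search results."""
    return [d for d in docs if is_publicly_viewable(d)]


if __name__ == "__main__":
    docs = [
        Document("Press kit", "downloads", "Public product overview", {"Everyone"}),
        Document("Salary bands", "intranet", "Confidential figures", {"Staff"}),
    ]
    for doc in filter_for_index(docs):
        print("would index:", doc.title)
```

The trade-off of this approach is that restricted documents are not searchable for anyone, including users who are allowed to open them.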
Hi Atanas
Sure is not nice to have Sitefinity display content publicly that is restricted to some users :-(
A long time ago in a galaxy far away I made a feature request for pages: an additional checkbox in Titles and Properties where you could exclude a page from the internal search engine.
So one for external (noindex)
and one for internal
http://www.telerik.com/support/pits.aspx#/public/sitefinity/5274
Maybe if this feature ever makes it onto a roadmap, it could be extended to documents.
Markus
Hello Markus,
Your feedback has been noted. I will do my best to bump the PITS up the priority list for 5.3 and also to include the options for Document search.
All the best,
Dear Atanas
Thanks for the feedback.
Markus
Dear Atanas
Could you say if this has made it to the 5.3 list?
To me it's a huge security flaw if you restrict access to a library, but search still indexes all the docs and even displays part of them on the search result list.
Markus
Hello,
I do not have an update on this, which is why we started working on a sample that will give you the ability to exclude a certain library from the index. That way the documents uploaded to it will be safe. Once we are done I will post it here.
Regards,
Hi,
Could someone please post an update on this problem? Did it get addressed in version 5.3? We're running 5.2, and showing search results of restricted content is a huge issue at the moment. It's to the point where I'm strongly considering replacing Sitefinity search with Google's custom search.
Thanks,
Kevin
Dear Kevin
I have not heard anything on this, which means there is a 99.9% chance it will be the same in 5.4.
I had to stop indexing documents because I had 3 restricted documents that would show up with an extract of the document in the search results. Of course they were not accessible.
I once made a feature request to add a second field http://www.telerik.com/support/pits.aspx#/public/sitefinity/5274 to prevent pages from being indexed internally and/or externally.
Vote on it, get friends to vote, keep asking Telerik to implement it, make some noise on the forum, be patient and you will get it :-)
Markus
@Telerik
Any news on this? So far, as soon as you have one single document that should be restricted, you cannot index documents, because that ONE single document gets indexed as well.
While you cannot open it, it will still show an extract in the search results.
So basically, at the moment, you cannot use search for documents out of the box if you want to restrict access to any document, unless you don't care if some of the content is revealed to everybody.
Markus
Hi Markus,
You can try modifying the solution in this blog post: Trim the Search Results based on permissions / roles. You should be able to extend it to work for documents. If you set the permission for that document so it is visible only to Administrators/Authenticated users, you will be able to omit it from the search results, avoiding exposure of non-public information.
This solution should work until this functionality is implemented out of the box.
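The general idea behind trimming results at display time can be sketched as follows. This is not the code from the blog post and does not use any Sitefinity API; it is a small Python illustration with hypothetical names (`Hit`, `view_roles`, `trim_results`), assuming each search hit knows which roles may view the underlying item.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    title: str
    snippet: str
    # Roles allowed to view the item behind this hit (assumed to be available).
    view_roles: frozenset


def trim_results(hits, user_roles):
    """Keep only the hits the current user may view.

    A hit is kept when the user holds at least one of its view roles,
    so neither the link nor the snippet of a restricted item is rendered.
    """
    roles = set(user_roles)
    return [h for h in hits if roles & h.view_roles]


if __name__ == "__main__":
    hits = [
        Hit("Press kit", "Public product overview...", frozenset({"Everyone"})),
        Hit("Salary bands", "Confidential figures...", frozenset({"Staff"})),
    ]
    # An anonymous visitor is assumed to hold only the "Everyone" role.
    for hit in trim_results(hits, {"Everyone"}):
        print(hit.title)
```

Unlike filtering at indexing time, this keeps restricted documents searchable for users who are allowed to see them, at the cost of a permission check on every query.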
Sorry
My first post got all messed up.
a) thanks for the link to the workaround
b) no thanks, I don't want to patch every deployed solution with a workaround
c) performance-wise, it would make more sense to me to prevent pages from being indexed at indexing time rather than when displaying results
One of the top 10 PITS feature requests: http://www.telerik.com/support/pits.aspx#/details/Issue=5274
Markus
PS:
should have status in progress: http://www.telerik.com/support/pits.aspx#/details/Issue=5841
might have status closed since margins are back (borders needed?): http://www.telerik.com/support/pits.aspx#/details/Issue=9375
@Atanas,
Until everything is ironed out, would it be possible for Sitefinity to implement at least a stopgap fix?
Technically 'downloadable-goods' might be a document library, but I'm fairly certain there's no one out there that wants purchasable documents being exposed through search.
---
Search results won't link to a downloadable-good if you're not allowed to access that file, but still the search snippet exposes info. Imagine for instance a file called serial.txt containing license keys.
---
If you guys could spend 45mins to squeeze in a check so search won't index that particular library, it would be greatly appreciated.
Jochem
Hi guys,
I completely agree with you. The big 6.0 release is on the final stretch and the team is working hard. This is a valid and valuable feature to have. Once the team focuses on PITS items after the 6.0 release, this one will have a high priority. I cannot give you any estimates at this point. I will post any updates here as I get them.
@Atanas,
Thanks for the swift response, but I politely disagree with your assessment of its importance.
Purchasable content should never be publicly exposed. It's not a matter of permission-based indexing; downloadable-goods should never have been made part of the documents library in the first place.
It's a bug, and a serious one that can (and has) directly cost clients money.
Surely someone in the next 10-15 days has 45 minutes to spare to exclude 'downloadable-goods' from the list that gets sent to Lucene to index...
[quote]Atanas Valchev said:
The big 6.0 release is on the final stretch and the team is working hard. This is valid and valuable feature to have. Once the team focuses on PITS items after the 6 release, this one will have a high priority. I cannot give you any estimations at this point. I will post any updates here as I get them.[/quote]
@Kevin
I agree and understand your frustration with Telerik sometimes having different priorities than we and our clients have.
BUT
There simply is no other CMS like Sitefinity usability-wise, and that makes it worth waiting for 6.5 for some stuff :-)
Markus
Hello,
I completely understand your frustration. The team is working very hard to include all client-requested improvements. However, during development and planning, the dynamics and priorities change. Your feedback is always kept in mind when discussing new additions or improving existing ones.
Greetings,
Any news on that: www.telerik.com/.../pits.aspx
Markus
Hi Markus,
There is nothing new I can share regarding this PITS at this point. I will post an update when the team has started working on it.
All the best,
@Telerik
More than 2 years old
More than 50 votes
www.telerik.com/.../pits.aspx
How about putting this on some roadmap? Even if it's 6.4.
Markus
Hello Markus,
At the moment the best solution is to use the updated sample in this blog post: Sitefinity Publishing System: A Brief Walkthrough, which includes a sample that will exclude documents in a certain library from the index. As for the feature itself, it has not been planned yet. If this changes, I will update the info here.
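For illustration only, excluding whole libraries from the index boils down to a check like the one below. This is not the sample from the blog post and uses no Sitefinity API; it is a Python sketch with hypothetical names (`EXCLUDED_LIBRARIES`, `should_index`, `select_for_index`), and the library names are simply the ones mentioned earlier in this thread.

```python
# Libraries whose documents must never reach the search index.
# The names are taken from this thread; adjust them to your own setup.
EXCLUDED_LIBRARIES = {"intranet", "downloadable-goods"}


def should_index(library_name, excluded=EXCLUDED_LIBRARIES):
    """Return False for documents that live in an excluded library.

    The comparison ignores case and surrounding whitespace so that
    "Intranet" and " intranet " are treated the same way.
    """
    return library_name.strip().lower() not in excluded


def select_for_index(items):
    """Filter (library_name, title) pairs down to the ones safe to index."""
    return [(lib, title) for lib, title in items if should_index(lib)]


if __name__ == "__main__":
    items = [
        ("Downloads", "Product brochure"),
        ("Intranet", "Salary bands"),
        ("downloadable-goods", "serial.txt"),
    ]
    for lib, title in select_for_index(items):
        print("would index:", lib, "/", title)
```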