NotesDominoQuery – Find the needle in the haystack (LS)

Imagine a big database with lots of documents in which you want to find one particular document. You can do that with FTSearch, or you can use db.Search with a selection formula.
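
For comparison, here is a minimal sketch of the db.Search approach. It borrows the form name, field name and database file from the DQL sample further down in this post; the sub name findWithSearch is made up for illustration, and no timing is claimed for it.

Public Sub findWithSearch()
	Dim session As New NotesSession
	Dim db As NotesDatabase
	Dim col As NotesDocumentCollection
	Dim formula As String
	
	' Selection formula, written in @formula syntax
	formula = {Form = "frm.rules.device.rule" & rule_unid = "99A242AAB69B5BB9C1257FFC005DE6C4"}
	
	' Empty server name = local database
	Set db = session.GetDatabase("", "trul-big.nsf", False)
	If db Is Nothing Then Exit Sub
	
	If db.IsOpen Then
		' Nothing = no cutoff date, 0 = no limit on the number of documents
		Set col = db.Search(formula, Nothing, 0)
		MsgBox "db.Search found " & CStr(col.Count) & " document(s)"
	End If
End Sub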

As of V10.0.x, you also have DQL and the new NotesDominoQuery class.
The class is available in LotusScript and Java.
In this sample, I want to demonstrate how you can find the needle in the haystack with DQL in LotusScript.

My database has about 12,500,000 documents. It is one of our customers' databases at midpoints. That number of documents was created by accident. Some call it a bug. Anyway, the database is a good playground.

The code is typical for an LS developer. It instantiates objects and stuff, assigns variables like our dqlTerm (line 6 of the listing; see the similarity to the @formula you would probably use with db.Search?), checks whether the target database is open (line 9) and also has some basic error handling (lines 17 and 20-21). We will come to that later.

I am running the sample on the client. As of today, you cannot run a query from the client against a database on a server. I will show in another post how you can run the query on the server and work with the results on the client. But that is another story.

Public Sub foo()
	Dim session As New NotesSession
	Dim db As NotesDatabase
	Dim col As NotesDocumentCollection
	Dim dqlTerm As String
	dqlTerm = "form = 'frm.rules.device.rule' And rule_unid = '99A242AAB69B5BB9C1257FFC005DE6C4'"
	' Open the local database (empty server name)
	Set db = session.GetDatabase("", "trul-big.nsf", False)
	If db.IsOpen Then
		' Create the query object for this database
		Dim dql As NotesDominoQuery
		Set dql = db.CreateDominoQuery()
		' Check the syntax of the term without running the query
		Dim parse_result As String
		parse_result = dql.Parse(dqlTerm)
		' Run the query only if the term parsed successfully
		If LCase(parse_result) = "success" Then
			Set col = dql.Execute(dqlTerm)
			MsgBox dql.Explain(dqlTerm)
		Else
			MsgBox parse_result
		End If
		
	End If
End Sub

Before you run the code and try to search such a huge amount of data, you need to set some notes.ini variables.

QUERY_MAX_DOCS_SCANNED=13000000 
QUERY_MAX_VIEW_ENTRIES_SCANNED=13000000

As of now, there is no other way to increase these limits. There are setters in the NotesDominoQuery class, but those setters are broken. This is a known issue and HCL is working on a solution.

When we now run the code, we get the following result:

It took 51.3 secs to find the 110,642 documents that use the form “frm.rules.device.rule” and next to no time to find the 1 document in that result set.

Would it be faster or slower if we modify our query to use only the rule_unid field and search over the entire set of documents?

dqlTerm = "rule_unid = '99A242AAB69B5BB9C1257FFC005DE6C4'"

Here is the result

32.2 secs to find the needle in the haystack. Use this kind of single-field search if you are sure that the field “rule_unid” is only used on one form.

But even if the field is used on another form, filtering the resulting NotesDocumentCollection afterwards is much faster than the AND dqlTerm from the original code.
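
Such an after-the-fact filter could look like the following sketch. The function name keepForm is hypothetical; it removes every document that does not use the given form from the collection (not from the database) and returns the number of documents left.

Function keepForm(col As NotesDocumentCollection, formName As String) As Long
	Dim doc As NotesDocument
	Dim nextDoc As NotesDocument
	
	Set doc = col.GetFirstDocument()
	Do Until doc Is Nothing
		' Fetch the successor first, because doc may be removed from the collection
		Set nextDoc = col.GetNextDocument(doc)
		If LCase(doc.GetItemValue("Form")(0)) <> LCase(formName) Then
			' Removes the document from the collection only, it is not deleted from the database
			Call col.DeleteDocument(doc)
		End If
		Set doc = nextDoc
	Loop
	
	keepForm = col.Count
End Function

With the collection from the query, the call would be something like keepForm(col, "frm.rules.device.rule").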

Maybe you already have a view in your application where rule_unid is in a sorted column.

Then we can use a modified dqlTerm to find the document in that view.

Again, let’s change our dqlTerm a little bit:

dqlTerm = "'all'.rule_unid = '99A242AAB69B5BB9C1257FFC005DE6C4'"

And here is the result

THAT is pretty cool, isn’t it? 5.6 msecs to find the needle in the haystack.

Can it be even faster? I think not, but let us take another look at the code that does the query.

        Dim parse_result As String
        parse_result = dql.parse(dqlTerm)
 
        If LCase(parse_result) = "success" Then
            Set col = dql.Execute(dqlTerm)
            MsgBox dql.Explain(dqlTerm)
        Else
            MsgBox parse_result
        End If

We can safely remove all of our “error handling”, that is, lines 14-17 and 20-22 of the original listing. Why? Because dql.Execute(dqlTerm) parses the query itself before executing it.

So we end up with the following code

Public Sub foo()
	Dim session As New NotesSession
	Dim db As NotesDatabase
	Dim col As NotesDocumentCollection
	Dim dqlTerm As String
	
	dqlTerm = "'all'.rule_unid = '99A242AAB69B5BB9C1257FFC005DE6C4'"

	Set db = session.GetDatabase("", "trul-big.nsf", False)
	If db.IsOpen Then
		Dim dql As NotesDominoQuery
		Set dql = db.CreateDominoQuery()
		Set col = dql.Execute(dqlTerm)
		MsgBox dql.Explain(dqlTerm)
	End If
End Sub

When we run the code, we get

Keep in mind that the numbers vary a little bit from run to run.

If we now modify our dqlTerm and add an error, we get a nice dialog box explaining the error in detail.
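
If you would rather handle the error yourself, for example in code that runs without a UI, a sketch along these lines might do. It assumes that Execute raises a regular, trappable LotusScript error for an invalid term; the sub name fooWithHandler and the deliberately broken term (the closing quote is missing) are made up for illustration.

Public Sub fooWithHandler()
	Dim session As New NotesSession
	Dim db As NotesDatabase
	Dim col As NotesDocumentCollection
	Dim dql As NotesDominoQuery
	Dim dqlTerm As String
	
	On Error GoTo errHandler
	
	' Deliberately broken: the closing quote is missing
	dqlTerm = "'all'.rule_unid = '99A242AAB69B5BB9C1257FFC005DE6C4"
	
	Set db = session.GetDatabase("", "trul-big.nsf", False)
	If db Is Nothing Then Exit Sub
	
	If db.IsOpen Then
		Set dql = db.CreateDominoQuery()
		Set col = dql.Execute(dqlTerm)
		MsgBox CStr(col.Count) & " document(s) found"
	End If
	Exit Sub
	
errHandler:
	' Err = error number, Erl = line number, Error$ = error text
	MsgBox "DQL query failed (error " & CStr(Err) & " in line " & CStr(Erl) & "): " & Error$
	Exit Sub
End Sub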

That’s all for today, I hope you find this information useful.