5 Replies Latest reply on Oct 31, 2013 6:22 PM by Robert Maillet

    SQL Query to return all documents with a given search string, across all variables, latest only.

    Robert Maillet

      I've been working on a SQL Query that will function as a robust keyword search. I was hoping to get some help in looking at this Query to simplify it. I think it's too complicated, but is the closest to achieving my desired results. Currently seems to return same results as EPDM Search, I think lol.

       

      Assumptions:

       

      1) I don't know what field to search on, a search card is linked to individual variables. I want a google style box. I know there is something similar in EPDM.

      2) I want my search criteria to span all variables, all configurations, return the Variables where the search criteria was found in.

      3) return a filename/documentID only if that search criteria is used in the latest version of that file.

      4) Tables Variablevalue,Documents,Variable.

       

      This is more of a time filler exercise, final output will probably be a Google style company web-app. I drove myself crazy with Joins, but finally went with storing a DataSet 1st then searching that DS as it held only my master search results.

       

      USE <VaultName>

      ;WITH DS AS (SELECT DocumentID,VariableID,ValueText,RevisionNo FROM VariableValue WHERE ValueText LIKE '%<SearchME>%')

       

      SELECT DS.*,V.RevisionNo,D.FileName,R.VariableName FROM DS,VariableValue V,Documents D,Variable R

      WHERE V.DocumentID=DS.DocumentID AND V.VariableID=DS.VariableID AND

      V.DocumentID=d.DocumentID AND V.VariableID=R.VariableID AND

      DS.RevisionNo=V.RevisionNo AND

      DS.RevisionNo=(SELECT MAX(Vr.RevisionNO) FROM VariableValue Vr WHERE Vr.DocumentID=DS.DocumentID AND Vr.VariableID=DS.VariableID)

      ORDER BY V.VariableID,V.DocumentID

       

      My DB is not huge, so it's hard for me to see how this query will perform. If anyone has a query that does this and wants to share, please do. I tend to over complicate things.

        • Re: SQL Query to return all documents with a given search string, across all variables, latest only.
          Robert Maillet

          Above Query only works if you enter one search criteria. That's not very helpful. This updated Query builds a concatinated field of all variables that match a criteria (with the same documentID). I suspect the performance of this query will not be that great in a year or two.

           

          What is the best out of the box method of providing only a single text box for users to enter key words and have EPDM search ALL VARIABLES for them with an AND condition. A user may want to search on a workstate, description, user etc...

           

          For completeness... this Query works well. It runs with an Add-In that displays a results dialog box. A user double clicks a result and launches explorer with the file selected (Browse To Functionality).

           

           

          ;WITH DS AS (SELECT DocumentID,

            STUFF((SELECT DISTINCT ', ' + CAST(ValueText as varchar(6000))

                     FROM VariableValue T2

                     WHERE T2.DocumentID = T1.DocumentID AND (ValueText LIKE '%WELDMENT%' OR ValueText LIKE '%RM%')

                     FOR XML PATH('')),1,1,'') AS AllValue

          from VariableValue T1 group by DocumentID)

           

          SELECT D.FileName,DS.* FROM Documents D, DS WHERE

          DS.DocumentID=D.DocumentID AND

          DS.AllValue IS NOT NULL AND DS.AllValue LIKE '%WELDMENT%' AND DS.AllValue LIKE '%RM%'

            • Re: SQL Query to return all documents with a given search string, across all variables, latest only.
              Charley Saint

              Robert,

               

              You might want to play around with this:

               

               

              with ds as (
              select documentid, max(RevisionNo) as RevisionNo, VariableID, ConfigurationID from
              VariableValue vv 
              where (IsLongText = 0 and ValueCache like '%a%')
              or (IsLongText = 1 and ValueText like '%a%')
              group by DocumentID, VariableID, ConfigurationID
              )
              
              
              select filename, variablename, valuetext
              from Documents d 
              inner join ds on d.DocumentID = ds.DocumentID
              inner join VariableValue vv on vv.DocumentID = ds.DocumentID
                        and vv.RevisionNo = ds.RevisionNo and vv.VariableID = ds.VariableID
                        and vv.ConfigurationID = ds.ConfigurationID
              inner join Variable v on v.VariableID = ds.VariableID
              
              
              

               

              That uses a bit of the logic built into EPDM to speed things up. If a variable value is < 255 it's stored in valuecache, if it's larger it's stored in valuetext, which searches much slower

                • Re: SQL Query to return all documents with a given search string, across all variables, latest only.
                  Robert Maillet

                  Thanks Charley for your Query, it is cleaner and more robust. But it makes the same mistake I made originally, when searching on multiple keywords.

                   

                  Let me try to explain.

                   

                  Your query, like my first one, will fail once you try multiple (2 or more) search words since the search words may exist for the same documentIDs but from different variables, which will place them in seperate SQL rows.

                   

                  ie a part 123.sldprt may have VARIABLES Description="WELDMENT" & TITLE2="TESTING COMPLETED" and if the user were to search on "weldment test" your query, and my original one would return empty. If the part 123.sldprt were to have Description="WELDMENT TESTING" then your query will work.

                   

                  The only thing I can think of is to concatinate all of the values (ValueText or ValueCashe - thx for the tip BTW) into a field called Allvalue and then run a query with an AND against that single field. This has performance issues written all over it since the original concatination DataSet is created using every single documentID (SELECT DISTINCT). So in a table with 9000 rows, there are 580 Distinct Documents... and the DB is 4-weeks old.

                   

                  I may be taking this a little too seriously, but I'm finding the EPDM search capability to be good, but not great. I want to be able to do a index style search across all variables but not a real index search into the document contents etc... Search cards work well enough, but always looking for more.

                    • Re: SQL Query to return all documents with a given search string, across all variables, latest only.
                      Charley Saint

                      Robert,

                       

                      What if you use the add-in to generate the query as follows:

                       

                      for weldment use:

                       

                       

                      with ds as (
                      select documentid, max(RevisionNo) as RevisionNo, VariableID, ConfigurationID from
                      VariableValue vv 
                      where (IsLongText = 0 and ValueCache like '%weldment%')
                      or (IsLongText = 1 and ValueText like '%weldment%')
                      group by DocumentID, VariableID, ConfigurationID
                      )
                      

                       

                      and for weldment test use:

                       

                       

                      with ds as (
                      select documentid, max(RevisionNo) as RevisionNo, VariableID, ConfigurationID from
                      VariableValue vv 
                      where (IsLongText = 0 and ValueCache like '%weldment%')
                      or (IsLongText = 1 and ValueText like '%weldment%')
                      or (IsLongText = 0 and ValueCache like '%test%')
                      or (IsLongText = 1 and ValueText like '%test%')
                      group by DocumentID, VariableID, ConfigurationID
                      )
                      

                       

                      and for "weldment test" use:

                       

                      with ds as (
                      select documentid, max(RevisionNo) as RevisionNo, VariableID, ConfigurationID from
                      VariableValue vv 
                      where (IsLongText = 0 and ValueCache like '%weldment test%')
                      or (IsLongText = 1 and ValueText like '%weldment test%')
                      group by DocumentID, VariableID, ConfigurationID
                      )
                      
                      
                        • Re: SQL Query to return all documents with a given search string, across all variables, latest only.
                          Robert Maillet

                          /* this query is kind of cool. Google like single text box */

                          CREATE TABLE #dummy(DocumentID int,VariableID int,Rev int);

                           

                          /* get all records with the given keywords using OR to pick up any records */

                          /* max revisions for given documentid,variableid,revision no. */

                          ;with ds as  (select distinct variableid, MAX(revisionno)as rev,documentid,valuetext from VariableValue

                          where (ValueText like '%weldment%' or ValueText like '%rm%')  group by DocumentID,valuetext,VariableID)

                           

                          /* the above query may have picked up extra records because of the OR clause*/

                          /* it is possible that each keyword existed in the same variableid with the max(revisionno) of each result - will return the */

                          /* max rev of each key word. outer join vs. inner join stuff */

                          /* eliminate duplicate rows with a revision that is not the latest*/

                          insert into #dummy (DocumentID,VariableID,Rev)

                          select distinct ds.DocumentID,ds.VariableID,MAX(ds.rev) as Rev from ds group by ds.DocumentID,ds.VariableID

                           

                          create table #allvals(DocumentID int,AllValues nvarchar(max));

                           

                          /* add back the valuetext field. Should use the Cashe value for performance... must match the documentid/revisionno/variableid PKs */

                          /* my life would be so much easier if this table was normalized properly... ie should have a ID field VariableValueID */

                          /* don't even want to know what I need to do if getting different configurations. search all configurations with this code */

                          insert into #allvals (DocumentID,AllValues)

                          select #dummy.DocumentID,v.ValueText FROM VariableValue V,#dummy WHERE v.DocumentID=#dummy.DocumentID and v.RevisionNo=#dummy.Rev and

                          v.VariableID=#dummy.VariableID

                           

                          drop table #dummy

                           

                          /* make a final table (3rd table) that has concatinated values for the valuetext field, with the documentids found in the above sql */

                          create table #finalvals(DocumentID int,AllValues nvarchar(max));

                          insert into #finalvals(DocumentID,AllValues)

                           

                          select distinct DocumentID,STUFF(

                          (select ',' + cast(AllValues as nvarchar(max))

                          FROM #allvals A WHERE A.documentid=B.documentid

                          FOR XML PATH('')),1,0,'') AS CatValue  FROM #allvals B

                           

                          drop table #allvals

                           

                          /* final search and join on the catinated table, using an AND, to select filename with only the keywords listed at latest versions */

                          select D.Filename,fv.* from documents D,#finalvals fv where

                          d.DocumentID=fv.DocumentID and

                          AllValues like '%weldment%' and AllValues like '%rm%' order by DocumentID

                           

                          drop table #finalvals

                           

                          /*--------------------------------------------

                          make a function that build up the AND/OR values...

                          VB.NET Code to repalce the bolded underlined SQL code above.

                          Private Function BuildSearchConditions(VarList As String, Optional Join As String = "AND", Optional Variable As String = "ValueText") As String

                                  Dim BuildString As String = ""

                                  Dim pos As Integer = 0

                                  Dim Str As String = VarList

                           

                                  If Str = "" Then Return ""

                           

                                  BuildString = Variable & " LIKE '%"

                           

                                  Do While True

                                      pos = 0

                                      pos = InStr(Str, " ")

                                      If pos = 0 Then

                                          BuildString = BuildString & Str & "%'"

                                          Exit Do

                                      End If

                           

                                      BuildString = BuildString & Left(Str, pos - 1) & "%' " & Join & " " & Variable & " LIKE '%"

                           

                                      Str = Right(Str, Len(Str) - pos)

                                  Loop

                           

                                  Return BuildString

                          End Function

                          */