Search engine for XML documents

One of the entries collecting dust in this blog’s “draft” folder was about how nice it would be to have a search engine for XML documents. So when Google Code Search was announced, I thought it was finally done and I could delete the never-published entry. Well, it turns out it doesn’t support searching XML documents. I don’t care to debate whether XML (or some XML dialects) is code or not; all I know is that it would be very nice to be able to do things such as:

  • look for instances of a specific GED (global element declaration)
  • compare how often different XSD constructs are used (choice, sequence…)
  • look for all wsdl:binding elements that implement a given portType
  • look for all wsdl:port elements and all the WS-A EPRs that have an address in a given domain
  • look for all XML documents for which a given XPath query evaluates to “true”
  • look at the entire Web (or a subset of it) as one giant SML model and query it
  • even for good old HTML/XHTML documents, it would be nice to search them as XML documents and be able to look for pages that contain a certain string as part of the title element or as part of a list.
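
To make the wish list a bit more concrete, here is a minimal sketch of the "wsdl:binding for a given portType" query, run over a tiny in-memory set of documents rather than the Web. Everything in it is an assumption for illustration: the function name `documents_matching`, the sample file names, and the port type names are made up. It uses Python's standard `xml.etree.ElementTree`, which only understands a small subset of XPath, so a real engine would need a full XPath implementation, and it compares the `type` attribute as a plain string instead of resolving the QName properly.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used when evaluating the path (WSDL 1.1 namespace).
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}


def documents_matching(docs, path, namespaces=None):
    """Return the names of documents in which `path` selects at least one element.

    `docs` maps a document name to its XML text. Documents that are not
    well-formed XML are skipped rather than reported as errors.
    """
    hits = []
    for name, text in docs.items():
        try:
            root = ET.fromstring(text)
        except ET.ParseError:
            continue  # not well-formed, so not searchable as XML
        if root.find(path, namespaces) is not None:
            hits.append(name)
    return hits


# Hypothetical sample "index": two small WSDL fragments and one broken file.
docs = {
    "a.wsdl": """<wsdl:definitions xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
        <wsdl:binding name="StockBinding" type="tns:StockPortType"/>
    </wsdl:definitions>""",
    "b.wsdl": """<wsdl:definitions xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
        <wsdl:binding name="OtherBinding" type="tns:OtherPortType"/>
    </wsdl:definitions>""",
    "broken.xml": "<unclosed>",
}

# Simplification: the portType QName is matched as a literal string.
query = ".//wsdl:binding[@type='tns:StockPortType']"
print(documents_matching(docs, query, WSDL_NS))
```

The hard part of a real XML search engine is of course not this loop but indexing documents so that such queries don't require parsing every file each time.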

In the meantime, people are going to have fun searching for passwords embedded in source code and other vulnerabilities.
