Compare & Highlight Large Strings

taintedSoul

New Member
Joined
Mar 17, 2011
Messages
1
I am attempting to build an automated spider to compare two versions of the 'same' web page by extracting the HTML source code of each URL (old & new). There should be no visual differences, but there may be some differences in hidden metadata and URLs, so I don't want to do a 1:1 comparison. Instead I would like to either count the number of differences or skip the known differences (server names, domains, other minor changes, etc.).

I have the code to store the HTML into string objects that are ready to compare:

Code:
strPage1 = GetPageHTML(strFullURL, intTimeout)
strPage2 = GetPageHTML(strFullURL, intTimeout)

I would like to compare the differences between strPage1 and strPage2 but keep comparing after encountering the first difference and output the # of discrepancies to column (E) on my worksheet. Any ideas?

Example of strPage1:
Code:
<html>
<head>
<meta name="server" content="server01">
<meta name="domain" content="google.com">
<script type="text/javascript" src="http://google.com/file1.js"></script>
</head>
<body>
<p><a href="http://google.com/searchResults/test.html">Test URL</a></p>
</body>
</html>

Example of strPage2:

Code:
<html>
<head>
<meta name="server" content="server02">
<meta name="domain" content="bing.com">
<script type="text/javascript" src="http://bing.com/file1.js"></script>
</head>
<body>
<p><a href="http://bing.com/searchResults/test.html">Test URL</a></p>
</body>
</html>

I would want this to either return "4" for 4 differences, or somehow highlight the various differences, or even a % of how different the two strings are from each other.
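
One way to get such a count is to split both strings into lines and compare them position by position without stopping at the first mismatch. The following is only a minimal sketch under that assumption; the names SplitLines and CountLineDifferences are made up for illustration and are not part of any library.

```vba
Option Explicit

' Hypothetical helper: splits a page into lines, treating CRLF, CR and LF alike.
Private Function SplitLines(ByVal strText As String) As Variant
    strText = Replace(strText, vbCrLf, vbLf)
    strText = Replace(strText, vbCr, vbLf)
    SplitLines = Split(strText, vbLf)
End Function

' Hypothetical function: returns the number of line positions at which
' the two pages differ. A line present in only one page counts as one difference.
Public Function CountLineDifferences(ByVal strPage1 As String, _
                                     ByVal strPage2 As String) As Long
    Dim arr1 As Variant, arr2 As Variant
    Dim lngMax As Long, i As Long, lngDiffs As Long

    arr1 = SplitLines(strPage1)
    arr2 = SplitLines(strPage2)

    ' Walk as far as the longer of the two pages.
    lngMax = UBound(arr1)
    If UBound(arr2) > lngMax Then lngMax = UBound(arr2)

    For i = 0 To lngMax
        If i > UBound(arr1) Or i > UBound(arr2) Then
            ' Line exists in one page only.
            lngDiffs = lngDiffs + 1
        ElseIf StrComp(arr1(i), arr2(i), vbBinaryCompare) <> 0 Then
            ' Same position, different text (case-sensitive).
            lngDiffs = lngDiffs + 1
        End If
    Next i

    CountLineDifferences = lngDiffs
End Function
```

The result could then be written to the worksheet with something like Cells(lngRow, "E").Value = CountLineDifferences(strPage1, strPage2), where lngRow is whatever row the spider is currently on. For the two sample pages above, with one tag per line, this counts 4 differing lines. Dividing the count by the total number of lines gives a rough percentage. Note that the comparison is positional: a single inserted or deleted line shifts everything after it and inflates the count.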

I wasn't able to find anything similar to this by simply googling, but if there is something already out there I apologize.

Thanks in advance for any advice! Other suggestions for automating this same type of thing are definitely welcome.
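
For skipping the known differences, one possible approach is to replace every known server name and domain with the same placeholder in both strings before comparing them. This is a sketch only; MaskKnownDifferences and the token list are assumptions for illustration, and the list would have to be maintained by hand.

```vba
Option Explicit

' Hypothetical function: replaces each known, acceptable token
' (server names, domains, ...) with a fixed placeholder.
Public Function MaskKnownDifferences(ByVal strHtml As String, _
                                     ByVal varTokens As Variant) As String
    Dim i As Long
    For i = LBound(varTokens) To UBound(varTokens)
        ' Case-insensitive replace of every occurrence of the token.
        strHtml = Replace(strHtml, CStr(varTokens(i)), "#", , , vbTextCompare)
    Next i
    MaskKnownDifferences = strHtml
End Function

' Example use with the tokens from the sample pages.
Public Function PagesMatchAfterMasking(ByVal strPage1 As String, _
                                       ByVal strPage2 As String) As Boolean
    Dim varKnown As Variant
    varKnown = Array("server01", "server02", "google.com", "bing.com")

    PagesMatchAfterMasking = _
        (StrComp(MaskKnownDifferences(strPage1, varKnown), _
                 MaskKnownDifferences(strPage2, varKnown), _
                 vbBinaryCompare) = 0)
End Function
```

With the sample pages, all four differing lines become identical once the server names and domains are masked, so only unexpected differences would remain to be counted or highlighted.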
 

This seems like a good candidate for conditional formatting. If you copy and paste your first sample code (strPage1) into an Excel sheet, it would occupy 10 rows. I know it's only an example, but let's say it occupies a hundred rows, or a thousand, or ten thousand. Whatever the case, skip a row and paste in the next (strPage2) block of code. Conditionally format the first cell of the second (strPage2) block with the formula rule =A10002<>A1, and select a format color to identify the cells where differences lie. Then copy cell A10002 and use Paste Special > Formats on the rest of the strPage2 used range. I used cell A10002 for the hypothetical case where your first strPage1 block is ten thousand rows long.

Maybe I am missing something obvious but that's my first impression of a way to go about obtaining the result you seem to be asking for.
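
If the strings are already held in VBA variables, the same idea can be sketched as a macro. This is a variation rather than the exact steps above: it places the two pages side by side in columns A and B instead of stacking them, and colors the cells directly instead of adding a conditional formatting rule. The procedure name and the choice of columns are assumptions.

```vba
Option Explicit

' Hypothetical procedure: writes each page line by line into columns A and B
' of the given sheet and shades the rows where the two lines differ.
Public Sub HighlightLineDifferences(ByVal ws As Worksheet, _
                                    ByVal strPage1 As String, _
                                    ByVal strPage2 As String)
    Dim arr1 As Variant, arr2 As Variant
    Dim lngMax As Long, i As Long
    Dim str1 As String, str2 As String

    arr1 = Split(Replace(strPage1, vbCrLf, vbLf), vbLf)
    arr2 = Split(Replace(strPage2, vbCrLf, vbLf), vbLf)

    lngMax = UBound(arr1)
    If UBound(arr2) > lngMax Then lngMax = UBound(arr2)

    ' Wipes columns A:B of the target sheet, so use a scratch sheet.
    ws.Range("A:B").Clear
    ' Text format so that no line is interpreted as a formula or number.
    ws.Range("A:B").NumberFormat = "@"

    For i = 0 To lngMax
        str1 = vbNullString
        str2 = vbNullString
        If i <= UBound(arr1) Then str1 = arr1(i)
        If i <= UBound(arr2) Then str2 = arr2(i)

        ws.Cells(i + 1, 1).Value = str1
        ws.Cells(i + 1, 2).Value = str2

        ' Shade both cells when the lines are not identical.
        If StrComp(str1, str2, vbBinaryCompare) <> 0 Then
            ws.Range(ws.Cells(i + 1, 1), ws.Cells(i + 1, 2)).Interior.Color = vbYellow
        End If
    Next i
End Sub
```

Writing cell by cell is slow for very large pages, and a single cell holds at most 32,767 characters, so a page delivered as one very long line would need to be split differently.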
 