Re-Post // Pulling HTML Header Data From IMDB

BrianExcel

Well-known Member
Joined
Apr 21, 2010
Messages
975
A fellow forum member told me that when he responded the thread shut down and wouldn't allow responses or edits - so I have to re-post (sorry moderators, I don't ever re-post but I don't have much of an option on this one if the original thread won't work).

====== Start Re-Post========

Good morning folks,

Right now I am using the following code that opens an Internet Explorer Browser window and navigates to a specific title on IMDB using the movie code (such as "tt0118531" for the movie "One Eight Seven").

Code:
[/COLOR][COLOR=#333333]Public Sub NavigateToURL(ByVal argURL As String)[/COLOR]

  Const READYSTATE_COMPLETE As Integer = 4  Dim objIE As Object    Set objIE = CreateObject("InternetExplorer.Application")  With objIE    .Visible = True    .Silent = True    .Navigate argURL    Do Until .ReadyState = READYSTATE_COMPLETE      DoEvents    Loop  End With  'objIE.Quit  Set objIE = Nothing0 </pre>[COLOR=#333333]End Sub[/COLOR][COLOR=#333333]

This code works fine - but now I need to add a couple more steps.

1. I need to pull the title "One Eight Seven" from the HTML of the IMDB page. I have already located it in the HTML code, shown below:

(open code tag "title")One Eight Seven (1997) - IMDb(closing code tag "/title")

I need to know how to pull that title into column B - column A contains the movie code I referenced above.

2. The next step, is there is an image on that page of the movie cover box. I want to save that image to a folder of my choosing, and name the file as the title that we pulled in step 1.

I was unable to find where the specific image code is, so I am hoping someone can help me with that. Any thoughts from anyone?

====== End Re-Post========
 

Excel Facts

Fastest way to copy a worksheet?
Hold down the Ctrl key while dragging tab for Sheet1 to the right. Excel will make a copy of the worksheet.
Ok no breaking it this time I promise!!!


For the first part
You need to remove the # signs from below:

Code:
Function GetMovieTitle(imdbTag As String) As String

With CreateObject("WinHttp.WinHttpRequest.5.1")
    .Open "GET", "http://www.imdb.com/title/" & imdbTag, False
    .send
    GetMovieTitle = Split(Split(.responsetext, "<#title#>")(1), "<#/")(0)
End With


End Function

This will work as a worksheet function so =GetMovieTitle(A1) dragged down - note that it uses the imdb reference rather than the url so feel free to adapt if that's what you've got.

For the second part, do you have ie9 installed?
 
Upvote 0
Thanks a million Kyle.

I have IE 8. I am on a corporate computer and can't install 9 yet per our internal rules.
 
Upvote 0
you sure you don't want the pictures directly in the worksheet? You want them in a folder?
 
Upvote 0
I would prefer them in a folder as separate files. I own all the movies I am looking and am importing them into iTunes to digitize my collection, but when I rip the file it doesn't contain any image so I want to get one - without downloading a 1,000 pictures individually.

Sorry if that causes more of a pain - I can try and download IE9 on my home computer and run it from there if it makes it a little easier...
 
Upvote 0
No probs:

Again, you'll need to remove the # signs, it's a little hacky since you don't have IE9, but it works for the ones I've tested

It will break when the movie title has characters in that aren't valid in filenames - so you'll want to check for that :) - I haven't in the below, again it will work as a worksheet function - as a replacement for the earlier code.

Code:
Public Function GetMovieTitle(imdbTag As String) As String


Const MyPicPath As String = "C:\SomePath\"
Dim MovieTitle As String
Dim imageUrl As String
Dim b() As Byte




With CreateObject("WinHttp.WinHTTPRequest.5.1")
    .Open "GET", "http://www.imdb.com/title/" & imdbTag, False
    .send
    imageUrl = Split(Split(Split(.responseText, "img_primary")(1), "src=""")(1), """")(0)
    MovieTitle = Split(Split(.responseText, "<#title#>")(1), "<#/")(0)
    .Open "GET", imageUrl, False
    .send
    b = .responseBody
 End With
 
 With CreateObject("ADODB.Stream")
    .Type = 1
    .Open
    .Write b
    .SaveToFile MyPicPath & MovieTitle & ".jpg"
    .Close
End With


GetMovieTitle = MovieTitle


End Function
 
Last edited:
Upvote 0
Kyle,

Thanks SO much!

Sorry it took me so long to get back to you - I just had the opportunity to check this, and it works PERFECTLY.

Thanks again!
 
Upvote 0
Kyle -

One additional question. I put this code into a button, so I could add a step or two. Could you possibly modify the code you provided to do one more thing?

The picture files are writing perfectly, and I see what you mean about it breaking when a character produces a bad name. The one last thing I need to do, is add an extra condition that looks in the folder the pictures are being saved to as the code is running, and checks for the name.

For example, I have both the cartoon and live version of 101 Dalmations - but when I try and run the code, it stops at the second 101 Dalmations, and won't continue because the file already exists.

Could you help me out with some code that says...

If the File doesn't exist, create it
ELSE
If the file DOES exist, create it, but with (1) at the end, so it creates a unique file name



I have been trying to get this all day - but no luck. I need help!
 
Upvote 0
It took a while to figure out, but I finally got it...

Code:
If Len(Dir(MyPicPath & MovieTitle2 & ".jpg")) = 0 Then
    With CreateObject("ADODB.Stream")
        .Type = 1
        .Open
        .Write b
        .SaveToFile MyPicPath & MovieTitle2 & ".jpg"
        .Close
    End With
Else
    With CreateObject("ADODB.Stream")
        .Type = 1
        .Open
        .Write b
        .SaveToFile MyPicPath & MovieTitle2 & " (1)" & ".jpg"
        .Close
    End With
End If
 
Upvote 0

Forum statistics

Threads
1,215,334
Messages
6,124,319
Members
449,153
Latest member
JazzSingerNL

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top